Publication Type
Conference Proceeding Article
Version
publishedVersion
Publication Date
8-2025
Abstract
Humans possess a remarkable ability to interpret underspecified ambiguous statements by inferring their meanings from contexts such as visual inputs. This ability, however, may not be as developed in recent pre-trained vision-language models (VLMs). In this paper, we introduce a novel probing dataset called FOCUS to evaluate whether state-of-the-art VLMs have this ability. FOCUS consists of underspecified sentences paired with image contexts and carefully designed probing questions. Our experiments reveal that VLMs still fall short in handling underspecification even when visual inputs that can help resolve the ambiguities are available. To further support research in underspecification, FOCUS will be released for public use. We hope this dataset will inspire further research on the reasoning and contextual understanding capabilities of VLMs.
Discipline
Graphics and Human Computer Interfaces | Programming Languages and Compilers
Research Areas
Data Science and Engineering
Publication
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics, ACL 2025, Vienna, Austria, July 27 - August 1
Volume
1
First Page
27565
Last Page
27584
Identifier
10.18653/v1/2025.acl-long.1337
City or Country
Vienna, Austria
Citation
ZHOU, Kankan; LAI, Yibin; MOURATIDIS, Kyriakos; and JIANG, Jing.
FOCUS: Evaluating pre-trained vision-language models on underspecification reasoning. (2025). Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics, ACL 2025, Vienna, Austria, July 27 - August 1. 1, 27565-27584.
Available at: https://ink.library.smu.edu.sg/sis_research/10276
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.
Additional URL
https://doi.org/10.18653/v1/2025.acl-long.1337
Included in
Graphics and Human Computer Interfaces Commons, Programming Languages and Compilers Commons