Publication Type
Conference Proceeding Article
Version
acceptedVersion
Publication Date
10-2025
Abstract
Recently, Multimodal Large Language Models (MLLMs) have achieved significant success across multiple disciplines due to their exceptional instruction-following capabilities and extensive world knowledge. However, whether these MLLMs possess human-like compositional reasoning abilities remains an open problem. To unveil their reasoning behaviors, we first curate a Multimodal Assumptive Reasoning Benchmark (MARS-Bench). Interestingly, we find that most prevalent MLLMs can be easily fooled by the introduction of a presupposition into the question, even though such presuppositions appear naive to human reasoning. We also propose a simple yet effective method, Active Deduction (AD), a novel reinforcement learning paradigm that encourages the model to actively perform composite deduction before reaching a final decision. Equipped with the proposed AD method, an MLLM demonstrates significant improvements in assumptive reasoning abilities without compromising its general-purpose question-answering performance. We also provide extensive evaluations of both open-source and private MLLMs on MARS-Bench, along with experimental analyses of the AD method.
Keywords
Assumptive reasoning, MLLMs, VQA, Benchmark, GRPO
Discipline
Graphics and Human Computer Interfaces
Research Areas
Intelligent Systems and Optimization
Areas of Excellence
Digital transformation
Publication
MM '25: The 33rd ACM International Conference on Multimedia, Dublin, Ireland, October 27-31
First Page
2713
Last Page
2722
ISBN
9798400720352
Identifier
10.1145/3746027.3754720
Publisher
ACM
City or Country
New York
Citation
LI, Yian; TIAN, Wentao; JIAO, Yang; CHEN, Jingjing; QIAN, Tianwen; ZHU, Bin; ZHAO, Na; and JIANG, Yu‑Gang.
Look before you decide: Prompting active deduction of MLLMs for assumptive reasoning. (2025). MM '25: The 33rd ACM International Conference on Multimedia, Dublin, Ireland, October 27-31. 2713-2722.
Available at: https://ink.library.smu.edu.sg/sis_research/10434
Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.
Additional URL
https://doi.org/10.1145/3746027.3754720