Publication Type
Conference Proceeding Article
Version
publishedVersion
Publication Date
11-2024
Abstract
Detecting multimodal misinformation, especially in the form of image-text pairs, is crucial. Obtaining large-scale, high-quality real-world fact-checking datasets for training detectors is costly, leading researchers to use synthetic datasets generated by AI technologies. However, the generalizability of detectors trained on synthetic data to real-world scenarios remains unclear due to the distribution gap between the two. To address this, we propose learning from synthetic data for detecting real-world multimodal misinformation through two model-agnostic data selection methods that match the synthetic and real-world data distributions. Experiments show that our method enhances the performance of a small MLLM (13B) on real-world fact-checking datasets, even enabling it to surpass GPT-4V.
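To make the distribution-matching idea in the abstract concrete, below is a minimal sketch of one plausible data selection step: score each synthetic image-text pair by the similarity of its embedding to the real-world data and keep only the closest samples. The embedding source, the mean-cosine-similarity score, and the keep_ratio parameter are illustrative assumptions, not the paper's actual methods.

# A hypothetical distribution-matching selection step: keep the synthetic
# samples whose embeddings lie closest to the real-world fact-checking data.
# All names and thresholds here are illustrative assumptions.
import numpy as np

def select_synthetic(synthetic_emb: np.ndarray,
                     real_emb: np.ndarray,
                     keep_ratio: float = 0.5) -> np.ndarray:
    """Return indices of synthetic samples closest to the real distribution.

    synthetic_emb: (n_syn, d) embeddings of synthetic image-text pairs
    real_emb:      (n_real, d) embeddings of real-world fact-checking pairs
    """
    # L2-normalize so the dot product below is cosine similarity.
    syn = synthetic_emb / np.linalg.norm(synthetic_emb, axis=1, keepdims=True)
    real = real_emb / np.linalg.norm(real_emb, axis=1, keepdims=True)

    # Score each synthetic sample by its similarity to the mean real
    # embedding -- a crude proxy for "lies inside the real distribution".
    scores = syn @ real.mean(axis=0)

    # Keep the top-scoring fraction for training the detector.
    k = max(1, int(keep_ratio * len(scores)))
    return np.argsort(scores)[::-1][:k]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    synthetic = rng.normal(size=(1000, 768))        # stand-in for MLLM embeddings
    real = rng.normal(loc=0.1, size=(200, 768))     # stand-in for real-world data
    idx = select_synthetic(synthetic, real, keep_ratio=0.3)
    print(f"selected {len(idx)} of {len(synthetic)} synthetic samples")

In practice the scoring function is the interesting design choice: a mean-embedding similarity is the simplest option, and heavier alternatives (e.g., kernel-based distribution distances) trade compute for a tighter match to the real data.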
Discipline
Databases and Information Systems
Research Areas
Data Science and Engineering
Areas of Excellence
Digital transformation
Publication
Findings of the Association for Computational Linguistics: EMNLP 2024, Miami, Florida, November 12-16
First Page
10467
Last Page
10484
Identifier
10.18653/v1/2024.findings-emnlp.613
City or Country
USA
Citation
ZENG, Fengzhu; LI, Wenqian; GAO, Wei; and PANG, Yan.
Multimodal misinformation detection by learning from synthetic data with multimodal LLMs. (2024). Findings of the Association for Computational Linguistics: EMNLP 2024, Miami, Florida, November 12-16. 10467-10484.
Available at: https://ink.library.smu.edu.sg/sis_research/9879
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Additional URL
https://doi.org/10.18653/v1/2024.findings-emnlp.613