Publication Type
Conference Proceeding Article
Version
publishedVersion
Publication Date
7-2023
Abstract
Existing research on multimodal relation extraction (MRE) faces two co-existing challenges: internal-information over-utilization and external-information under-exploitation. To combat these, we propose a novel framework that simultaneously implements the ideas of internal-information screening and external-information exploiting. First, we represent the fine-grained semantic structures of the input image and text with visual and textual scene graphs, which are further fused into a unified cross-modal graph (CMG). Based on the CMG, we perform structure refinement under the guidance of the graph information bottleneck principle, actively denoising the less-informative features. Next, we perform topic modeling over the input image and text, incorporating latent multimodal topic features to enrich the contexts. On the benchmark MRE dataset, our system outperforms the current best model significantly. With further in-depth analyses, we reveal the great potential of our method for the MRE task.
Keywords
computational linguistics; data mining; extraction; information retrieval
Discipline
Computer Sciences
Research Areas
Intelligent Systems and Optimization
Publication
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics
Volume
Volume 1: Long Papers
First Page
14734
Last Page
14751
Identifier
10.18653/v1/2023.acl-long.823
Publisher
Association for Computational Linguistics
City or Country
Canada
Citation
WU, Shengqiong; FEI, Hao; CAO, Yixin; BING, Lidong; and CHUA, Tat-Seng.
Information screening whilst exploiting! Multimodal relation extraction with feature denoising and multimodal topic modeling. (2023). Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics. Volume 1: Long Papers, 14734-14751.
Available at: https://ink.library.smu.edu.sg/sis_research/8261
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Additional URL
https://doi.org/10.18653/v1/2023.acl-long.823