Publication Type

Conference Proceeding Article

Version

publishedVersion

Publication Date

2-2020

Abstract

Event detection is a crucial and challenging sub-task of event extraction, which suffers from severe ambiguity of trigger words. Existing works mainly focus on textual context information, while the many images that naturally accompany news articles remain largely unexplored. We believe that images not only reflect the core events of the text, but also help disambiguate trigger words. In this paper, we first contribute an image dataset that supplements ED benchmarks (i.e., ACE2005) for training and evaluation. We then propose a novel Dual Recurrent Multimodal Model, DRMM, to conduct deep interactions between images and sentences for modality feature aggregation. DRMM utilizes pre-trained BERT and ResNet to encode sentences and images, and employs an alternating dual attention mechanism to select informative features for mutual enhancement. Our superior performance compared to six state-of-the-art baselines, as well as further ablation studies, demonstrates the significance of the image modality and the effectiveness of the proposed architecture. The code and image dataset are available at https://github.com/shuaiwa16/image-enhanced-event-extraction.
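The sketch below is a minimal, hedged illustration of the alternating dual attention idea summarized in the abstract, not the authors' implementation: BERT token features and ResNet region features attend to each other in alternating steps so that each modality is enhanced by the other before trigger classification. All module and parameter names (DualAttentionFusion, num_steps, hidden_dim, etc.) are illustrative assumptions; see the linked repository for the actual code.

# Minimal sketch (not the authors' code) of alternating dual attention between
# sentence tokens (e.g., BERT outputs) and image regions (e.g., ResNet feature
# map cells). Names and dimensions are assumptions for illustration only.
import torch
import torch.nn as nn

class DualAttentionFusion(nn.Module):
    def __init__(self, text_dim=768, image_dim=2048, hidden_dim=512, num_steps=2):
        super().__init__()
        self.text_proj = nn.Linear(text_dim, hidden_dim)    # project BERT token features
        self.image_proj = nn.Linear(image_dim, hidden_dim)  # project ResNet region features
        # one cross-attention block per direction per alternating step
        self.txt2img = nn.ModuleList([
            nn.MultiheadAttention(hidden_dim, num_heads=8, batch_first=True)
            for _ in range(num_steps)])
        self.img2txt = nn.ModuleList([
            nn.MultiheadAttention(hidden_dim, num_heads=8, batch_first=True)
            for _ in range(num_steps)])

    def forward(self, text_feats, image_feats):
        # text_feats:  (batch, num_tokens, text_dim)   -- sentence encodings
        # image_feats: (batch, num_regions, image_dim) -- image region encodings
        t = self.text_proj(text_feats)
        v = self.image_proj(image_feats)
        for txt2img, img2txt in zip(self.txt2img, self.img2txt):
            # alternate: text queries attend over image regions, then image
            # queries attend over the enhanced text (mutual enhancement)
            t_new, _ = txt2img(query=t, key=v, value=v)
            t = t + t_new
            v_new, _ = img2txt(query=v, key=t, value=t)
            v = v + v_new
        return t, v  # enhanced token and region features for trigger classification

if __name__ == "__main__":
    fusion = DualAttentionFusion()
    tokens = torch.randn(2, 32, 768)    # stand-in for BERT sentence encodings
    regions = torch.randn(2, 49, 2048)  # stand-in for a 7x7 ResNet feature map
    t, v = fusion(tokens, regions)
    print(t.shape, v.shape)  # (2, 32, 512) and (2, 49, 512)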

Discipline

Databases and Information Systems | Graphics and Human Computer Interfaces

Research Areas

Data Science and Engineering

Publication

Proceedings of the 34th AAAI Conference on Artificial Intelligence, New York, 2020 February 7-12

First Page

9040

Last Page

9047

ISBN

9781577358350

Identifier

10.1609/aaai.v34i05.6437

Publisher

AAAI

City or Country

New York

Additional URL

https://doi.org/10.1609/aaai.v34i05.6437
