Publication Type

Conference Proceeding Article

Version

publishedVersion

Publication Date

7-2022

Abstract

Event extraction aims to identify an event and then extract the arguments participating in the event. Despite the great success in sentence-level event extraction, events are more naturally presented in the form of documents, with event arguments scattered across multiple sentences. However, a major barrier to promoting document-level event extraction has been the lack of large-scale and practical training and evaluation datasets. In this paper, we present DocEE, a new document-level event extraction dataset including 27,000+ events and 180,000+ arguments. We highlight three features: large-scale manual annotations, fine-grained argument types and application-oriented settings. Experiments show that there is still a big gap between state-of-the-art models and human beings (41% vs. 85% in F1 score), indicating that DocEE is an open issue. DocEE is now available at https://github.com/tongmeihan1995/DocEE.git.

Discipline

Databases and Information Systems | Graphics and Human Computer Interfaces

Research Areas

Data Science and Engineering

Publication

Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Seattle, July 10-15

First Page

3970

Last Page

3982

Identifier

10.18653/v1/2022.naacl-main.291

Publisher

Association for Computational Linguistics

City or Country

Seattle, WA

Additional URL

http://doi.org/10.18653/v1/2022.naacl-main.291
