Publication Type
Conference Proceeding Article
Version
publishedVersion
Publication Date
7-2022
Abstract
Event extraction aims to identify an event and then extract the arguments participating in the event. Despite the great success in sentence-level event extraction, events are more naturally presented in the form of documents, with event arguments scattered in multiple sentences. However, a major barrier to promote document-level event extraction has been the lack of large-scale and practical training and evaluation datasets. In this paper, we present DocEE, a new document-level event extraction dataset including 27,000+ events, 180,000+ arguments. We highlight three features: large-scale manual annotations, fine-grained argument types and application-oriented settings. Experiments show that there is still a big gap between state-of-the-art models and human beings (41% vs. 85% in F1 score), indicating that DocEE is an open issue. DocEE is now available at https://github.com/tongmeihan1995/DocEE.git.
Discipline
Databases and Information Systems | Graphics and Human Computer Interfaces
Research Areas
Data Science and Engineering
Publication
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Seattle, July 10-15
First Page
3970
Last Page
3982
Identifier
10.18653/v1/2022.naacl-main.291
Publisher
Association for Computational Linguistics
City or Country
Seattle, WA
Citation
TONG, Meihan; XU, Bin; WANG, Shuai; HAN, Meihuan; CAO, Yixin; ZHU, Jiangqi; CHEN, Siyu; HOU, Lei; and LI, Juanzi. DocEE: A large-scale and fine-grained benchmark for document-level event extraction. (2022). Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Seattle, July 10-15. 3970-3982.
Available at: https://ink.library.smu.edu.sg/sis_research/7471
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.
Additional URL
http://doi.org/10.18653/v1/2022.naacl-main.291
Included in
Databases and Information Systems Commons, Graphics and Human Computer Interfaces Commons