A Systematic Exploration of the Feature Space for Relation Extraction
Publication Type
Conference Proceeding Article
Publication Date
4-2007
Abstract
Relation extraction is the task of finding semantic relations between entities from text. The state-of-the-art methods for relation extraction are mostly based on statistical learning, and thus all have to deal with feature selection, which can significantly affect the classification performance. In this paper, we systematically explore a large space of features for relation extraction and evaluate the effectiveness of different feature subspaces. We present a general definition of feature spaces based on a graphic representation of relation instances, and explore three different representations of relation instances and features of different complexities within this framework. Our experiments show that using only basic unit features is generally sufficient to achieve state-of-the-art performance, while overinclusion of complex features may hurt the performance. A combination of features of different levels of complexity and from different sentence representations, coupled with task-oriented feature pruning, gives the best performance.
Discipline
Databases and Information Systems | Numerical Analysis and Scientific Computing
Publication
Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics; 22-27 April 2007, Rochester, New York
First Page
113
Last Page
120
ISBN
9781932432916
Publisher
ACL
City or Country
Rochester, NY, USA
Citation
JIANG, Jing and ZHAI, ChengXiang. A Systematic Exploration of the Feature Space for Relation Extraction. (2007). Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics; 22-27 April 2007, Rochester, New York. 113-120.
Available at: https://ink.library.smu.edu.sg/sis_research/1254
Additional URL
http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.80.7503