Research Collection School Of Computing and Information Systems

Low-resource name tagging learned with weakly labeled data

Publication Type

Conference Proceeding Article

Version

publishedVersion

Publication Date

11-2019

Abstract

Name tagging in low-resource languages or domains suffers from inadequate training data. Existing work heavily relies on additional information, while leaving those noisy annotations unexplored that extensively exist on the web. In this paper, we propose a novel neural model for name tagging solely based on weakly labeled (WL) data, so that it can be applied in any low-resource settings. To take the best advantage of all WL sentences, we split them into high-quality and noisy portions for two modules, respectively: (1) a classification module focusing on the large portion of noisy data can efficiently and robustly pretrain the tag classifier by capturing textual context semantics; and (2) a costly sequence labeling module focusing on high-quality data utilizes Partial-CRFs with non-entity sampling to achieve global optimum. Two modules are combined via shared parameters. Extensive experiments involving five low-resource languages and fine-grained food domain demonstrate our superior performance (6% and 7.8% F1 gains on average) as well as efficiency.

Discipline

Databases and Information Systems | Graphics and Human Computer Interfaces

Research Areas

Data Science and Engineering

Publication

Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, Hong Kong, November 3-7

First Page

261

Last Page

270

Identifier

10.18653/v1/D19-1025

Publisher

Association for Computational Linguistics

City or Country

Hong Kong, China

Citation

CAO, Yixin; HU, Zikun; CHUA, Tat-Seng; LIU, Zhiyuan; and JI, Heng. Low-resource name tagging learned with weakly labeled data. (2019). Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, Hong Kong, November 3-7. 261-270.
Available at: https://ink.library.smu.edu.sg/sis_research/7457

Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.

Additional URL

http://doi.org/10.18653/v1/D19-1025

Download

Included in

Databases and Information Systems Commons, Graphics and Human Computer Interfaces Commons

COinS

Research Collection School Of Computing and Information Systems

Low-resource name tagging learned with weakly labeled data

Publication Type

Version

Publication Date

Abstract

Discipline

Research Areas

Publication

First Page

Last Page

Identifier

Publisher

City or Country

Citation

Creative Commons License

Additional URL

Included in

Search

Links

Browse

Links

Research Collection School Of Computing and Information Systems

Low-resource name tagging learned with weakly labeled data

Author

Publication Type

Version

Publication Date

Abstract

Discipline

Research Areas

Publication

First Page

Last Page

Identifier

Publisher

City or Country

Citation

Creative Commons License

Additional URL

Included in

Share

Search

Links

Browse

Links