Publication Type

Conference Proceeding Article

Version

publishedVersion

Publication Date

12-2017

Abstract

To enhance the expressive power of distributional word representation learning models, many researchers induce word senses through clustering and learn multiple embedding vectors for each word, an approach known as the multi-prototype word embedding model. However, most related work ignores the relatedness among word senses, which plays an important role. In this paper, we propose a novel approach to capturing word sense relatedness in the multi-prototype word embedding model. In particular, we differentiate the original sense and the extended senses of a word by introducing their global occurrence information, and we model their relatedness through local textual context information. Based on the idea of fuzzy clustering, we introduce a random process to integrate these two types of senses and design two non-parametric methods for word sense induction. To make our model more scalable and efficient, we use an online joint learning framework extended from the Skip-gram model. The experimental results demonstrate that our model outperforms both conventional single-prototype embedding models and other multi-prototype embedding models, and achieves more stable performance when trained on smaller data.

Discipline

Databases and Information Systems | Graphics and Human Computer Interfaces

Research Areas

Data Science and Engineering

Publication

Proceedings of the 8th International Joint Conference on Natural Language Processing, Taipei, Taiwan, 2017 November 27 - December 1

First Page

233

Last Page

242

Publisher

Association for Computational Linguistics

City or Country

Taipei, Taiwan

Citation

CAO, Yixin; LI, Juanzi; SHI, Jiaxin; LIU, Zhiyuan; and LI, Chengjiang. On modeling sense relatedness in multi-prototype word embedding. (2017). Proceedings of the 8th International Joint Conference on Natural Language Processing, Taipei, Taiwan, 2017 November 27 - December 1. 233-242.
Available at: https://ink.library.smu.edu.sg/sis_research/7469

Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

Included in

Databases and Information Systems Commons, Graphics and Human Computer Interfaces Commons

Research Collection School Of Computing and Information Systems

On modeling sense relatedness in multi-prototype word embedding
