Publication Type
Conference Proceeding Article
Version
publishedVersion
Publication Date
9-2021
Abstract
We study the task of learning and evaluating Chinese idiom embeddings. We first construct a new evaluation dataset that contains idiom synonyms and antonyms. Observing that existing Chinese word embedding methods may not be suitable for learning idiom embeddings, we further present a BERT-based method that directly learns embedding vectors for individual idioms. We empirically compare representative existing methods and our method. We find that our method substantially outperforms existing methods on the evaluation dataset we have constructed.
Discipline
Artificial Intelligence and Robotics | Programming Languages and Compilers
Research Areas
Data Science and Engineering
Publication
Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2021), Virtual Conference, September 1-3
First Page
1387
Last Page
1396
Identifier
10.26615/978-954-452-072-4_155
Publisher
Incoma Ltd.
City or Country
Virtual Conference
Citation
TAN, Minghuan and JIANG, Jing. Learning and evaluating Chinese idiom embeddings. (2021). Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2021), Virtual Conference, September 1-3. 1387-1396.
Available at: https://ink.library.smu.edu.sg/sis_research/6723
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Included in
Artificial Intelligence and Robotics Commons, Programming Languages and Compilers Commons