Publication Type
Conference Proceeding Article
Version
publishedVersion
Publication Date
8-2017
Abstract
Semantic visualization integrates topic modeling and visualization, such that every document is associated with a topic distribution as well as visualization coordinates in a low-dimensional Euclidean space. We address the problem of semantic visualization for short texts. Such documents are increasingly common, including tweets, search snippets, news headlines, and status updates. Because they are short, their semantics are difficult to model: word co-occurrences in such a corpus are very sparse. Our approach is to incorporate auxiliary information, such as word embeddings from a larger corpus, to compensate for the lack of co-occurrences. This requires the development of a novel semantic visualization model that seamlessly integrates visualization coordinates, topic distributions, and word vectors. We propose a model called GaussianSV, which outperforms pipelined baselines that derive topic models and visualization coordinates as disjoint steps, as well as semantic visualization baselines that do not consider word embeddings.
Keywords
Machine Learning, Data Mining, Feature Selection/Construction, Learning Graphical Models
Discipline
Databases and Information Systems | Data Storage Systems
Research Areas
Data Science and Engineering
Publication
Proceedings of the 26th International Joint Conference on Artificial Intelligence IJCAI-17, Melbourne, Australia, August 19-25
First Page
2074
Last Page
2080
ISBN
9780999241103
Identifier
10.24963/ijcai.2017/288
Publisher
IJCAI
City or Country
Vienna
Citation
LE, Van Minh Tuan and LAUW, Hady W.
Semantic visualization for short texts with word embeddings. (2017). Proceedings of the 26th International Joint Conference on Artificial Intelligence IJCAI-17, Melbourne, Australia, August 19-25. 2074-2080.
Available at: https://ink.library.smu.edu.sg/sis_research/3766
Copyright Owner and License
Authors
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.
Additional URL
https://doi.org/10.24963/ijcai.2017/288