Publication Type
Conference Proceeding Article
Version
acceptedVersion
Publication Date
8-2014
Abstract
Visualization of high-dimensional data such as text documents is widely applicable. The traditional approach is to find an appropriate embedding of the high-dimensional representation in a low-dimensional visualizable space. As topic modeling is a useful form of dimensionality reduction that preserves the semantics in documents, recent approaches aim for a visualization that is consistent with both the original word space and the semantic topic space. In this paper, we address the semantic visualization problem. Given a corpus of documents, the objective is to simultaneously learn the topic distributions and the visualization coordinates of documents. We propose a semantic visualization model that approximates L2-normalized data directly. The key is to associate each document with three representations: a coordinate in the visualization space, a multinomial distribution in the topic space, and a directional vector in a high-dimensional unit hypersphere in the word space. We join these representations in a unified generative model, and describe its parameter estimation through variational inference. Comprehensive experiments on real-life text datasets show that the proposed method outperforms the existing baselines on objective evaluation metrics for visualization quality and topic interpretability.
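The three per-document representations named in the abstract can be pictured with a small sketch. This is an illustration only, not the authors' model or code: the class name `DocumentRepresentation`, its field names, the helper `l2_normalize`, and all numeric values are hypothetical. It shows a 2-D visualization coordinate, a topic distribution that sums to 1, and a word-space vector scaled to unit L2 norm so that it lies on the unit hypersphere.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple


def l2_normalize(vec: List[float]) -> List[float]:
    """Scale a vector to unit Euclidean (L2) length."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize an all-zero vector")
    return [x / norm for x in vec]


@dataclass
class DocumentRepresentation:
    """Hypothetical container for the three views of one document."""
    coordinate: Tuple[float, float]  # position in the visualization space
    topics: List[float]              # multinomial distribution over topics
    direction: List[float]           # unit vector in the word space


# Made-up values: term weights over a 3-word vocabulary and 3 topics.
term_weights = [3.0, 0.0, 4.0]
doc = DocumentRepresentation(
    coordinate=(0.3, -1.2),
    topics=[0.7, 0.2, 0.1],
    direction=l2_normalize(term_weights),
)

# The direction has unit length; the topic proportions sum to one.
direction_norm = math.sqrt(sum(x * x for x in doc.direction))
topic_total = sum(doc.topics)
```

In the paper these quantities are learned jointly through a generative model and variational inference; the sketch only fixes the shape of the data, not how it is estimated.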
Keywords
dimensionality reduction, semantic visualization, spherical semantic embedding, spherical space, generative model, L2-normalized vector, topic model
Discipline
Databases and Information Systems | Numerical Analysis and Scientific Computing
Publication
KDD '14: Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, New York, August 24-27
First Page
1007
Last Page
1016
ISBN
9781450329569
Identifier
10.1145/2623330.2623620
Publisher
ACM
City or Country
New York
Citation
LE, Tuan M. V. and LAUW, Hady W.
Semantic Visualization for Spherical Representation. (2014). KDD '14: Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, New York, August 24-27. 1007-1016.
Available at: https://ink.library.smu.edu.sg/sis_research/2250
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.
Additional URL
https://doi.org/10.1145/2623330.2623620