Publication Type
Conference Proceeding Article
Version
publishedVersion
Publication Date
2-2019
Abstract
A number of real-world applications require comparison of entities based on their textual representations. In this work, we develop a topic model supervised by pairwise comparisons of documents. Such a model seeks to yield topics that help to differentiate entities along some dimension of interest, which may vary from one application to another. While previous supervised topic models consider document labels in an independent and pointwise manner, our proposed Comparative Latent Dirichlet Allocation (CompareLDA) learns predictive topic distributions that comply with the pairwise comparison observations. To fit the model, we derive a maximum likelihood estimation method via an augmented variational approximation algorithm. Evaluation on several public datasets underscores the strengths of CompareLDA in modelling document comparisons.
Keywords
topic model, document comparison
Discipline
Databases and Information Systems
Research Areas
Data Science and Engineering
Publication
Proceedings of the 33rd AAAI Conference on Artificial Intelligence 2019: Honolulu, January 27 - February 1
First Page
7112
Last Page
7119
Identifier
10.1609/aaai.v33i01.33017112
Publisher
AAAI Press
City or Country
Menlo Park, CA
Citation
TKACHENKO, Maksim and LAUW, Hady Wirawan. CompareLDA: A topic model for document comparison. (2019). Proceedings of the 33rd AAAI Conference on Artificial Intelligence 2019: Honolulu, January 27 - February 1. 7112-7119.
Available at: https://ink.library.smu.edu.sg/sis_research/4698
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.
Additional URL
https://doi.org/10.1609/aaai.v33i01.33017112