Publication Type
Conference Proceeding Article
Version
acceptedVersion
Publication Date
11-2021
Abstract
As a well-established probabilistic method, topic models seek to uncover latent semantics from plain text. In addition to having textual content, we observe that documents are usually compared in listwise rankings based on their content. For instance, countries worldwide are compared in an international ranking of electricity production based on their national reports. Such document comparisons constitute additional information that reveals documents' relative similarities. Incorporating them into topic modeling could yield comparative topics that help to differentiate and rank documents. Furthermore, because they are based on different comparison criteria, the observed document comparisons usually cover multiple aspects, each expressed as a distinct ranked list. For example, a country may be ranked higher in terms of electricity production, but fall behind others in terms of life expectancy or government budget. Considering such multiple aspects of comparison allows us to derive one set of topics that informs heterogeneous document similarities. We propose a generative topic model aimed at learning topics that are well aligned with multi-aspect listwise comparisons. Comprehensive experiments on public datasets demonstrate the advantage of the proposed method over baselines in jointly modeling topics and ranked lists.
Keywords
Generative Topic Model, Text Mining, Comparative Documents
Discipline
Databases and Information Systems | Data Science
Research Areas
Data Science and Engineering
Publication
CIKM '21: Proceedings of the ACM International Conference on Information and Knowledge Management, November 1-5, Virtual
First Page
2507
Last Page
2516
ISBN
9781450384469
Identifier
10.1145/3459637.3482398
Publisher
ACM
City or Country
New York
Embargo Period
12-13-2021
Citation
ZHANG, Delvin Ce and LAUW, Hady W.
Topic modeling for multi-aspect listwise comparison. (2021). CIKM '21: Proceedings of the ACM International Conference on Information and Knowledge Management, November 1-5, Virtual. 2507-2516.
Available at: https://ink.library.smu.edu.sg/sis_research/6432
Copyright Owner and License
Authors
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Additional URL
https://doi.org/10.1145/3459637.3482398