Publication Type
Conference Proceeding Article
Version
acceptedVersion
Publication Date
12-2022
Abstract
Most Neural Topic Models (NTMs) use a variational auto-encoder framework producing K topics, limited to the size of the encoder's output. These topics are interpreted by selecting the top activated words via the decoder weights or reconstructed vectors directly connected to each neuron. In this paper, we present a model-free two-stage process to reinterpret NTMs and derive further insights on the state of the trained model. Firstly, building on the original information from a trained NTM, we generate a pool of candidate "composite topics" by exploiting possible co-occurrences within the original set of topics, which decouples the strict interpretation of topics from the original NTM. This is followed by a combinatorial formulation to select a final set of composite topics, which we evaluate for coherence and diversity on a large external corpus. Lastly, we employ a user study to derive further insights on the reinterpretation process.
Keywords
topic modeling, composite topics, empirical study, machine learning, neural networks
Discipline
Artificial Intelligence and Robotics
Research Areas
Data Science and Engineering
Publication
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing
City or Country
USA
Citation
LIM, Jia Peng and LAUW, Hady Wirawan.
Towards reinterpreting neural topic models via composite activations. (2022). Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing.
Available at: https://ink.library.smu.edu.sg/sis_research/7610
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Additional URL
https://preview.aclanthology.org/emnlp-22-ingestion/2022.emnlp-main.242/