Publication Type

Conference Proceeding Article

Version

publishedVersion

Publication Date

12-2023

Abstract

Topic Modelling is an established research area where the quality of a given topic is measured using coherence metrics. Often, we infer topics from Neural Topic Models (NTM) by interpreting their decoder weights, consisting of top-activated words projected from individual neurons. Transformer-based Language Models (TLM) similarly consist of decoder weights. However, due to its hypothesised superposition properties, the final logits originating from the residual path are considered uninterpretable. Therefore, we posit that we can interpret TLM as superposed NTM by proposing a novel weight-based, model-agnostic and corpus-agnostic approach to search and disentangle decoder-only TLM, potentially mapping individual neurons to multiple coherent topics. Our results show that it is empirically feasible to disentangle coherent topics from GPT-2 models using the Wikipedia corpus. We validate this approach for GPT-2 models using Zero-Shot Topic Modelling. Finally, we extend the proposed approach to disentangle and analyse LLaMA models.

Discipline

Databases and Information Systems | Programming Languages and Compilers

Research Areas

Data Science and Engineering

Publication

Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, Singapore, December 6-10

First Page

8646

Last Page

8666

Publisher

ACL

City or Country

Singapore

Citation

LIM, Jia Peng and LAUW, Hady Wirawan. Disentangling transformer language models as superposed topic models. (2023). Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, Singapore, December 6-10. 8646-8666.
Available at: https://ink.library.smu.edu.sg/sis_research/8470

Copyright Owner and License

Authors

Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.

Additional URL

https://aclanthology.org/2023.emnlp-main.534

Download

Included in

Databases and Information Systems Commons, Programming Languages and Compilers Commons

COinS

Research Collection School Of Computing and Information Systems

Disentangling transformer language models as superposed topic models

Publication Type

Version

Publication Date

Abstract

Discipline

Research Areas

Publication

First Page

Last Page

Publisher

City or Country

Citation

Copyright Owner and License

Creative Commons License

Additional URL

Included in

Search

Links

Browse

Links

Research Collection School Of Computing and Information Systems

Disentangling transformer language models as superposed topic models

Author

Publication Type

Version

Publication Date

Abstract

Discipline

Research Areas

Publication

First Page

Last Page

Publisher

City or Country

Citation

Copyright Owner and License

Creative Commons License

Additional URL

Included in

Share

Search

Links

Browse

Links