Synergizing Large Language Models and pre-trained smaller models for conversational intent discovery
Publication Type
Conference Proceeding Article
Version
publishedVersion
Publication Date
8-2024
Abstract
In Conversational Intent Discovery (CID), Small Language Models (SLMs) struggle with overfitting to familiar intents and fail to label newly discovered ones. This issue stems from their limited grasp of semantic nuances and their intrinsically discriminative framework. Therefore, we propose Synergizing Large Language Models (LLMs) with pre-trained SLMs for CID (SynCID). It harnesses the profound semantic comprehension of LLMs alongside the operational agility of SLMs. By utilizing LLMs to refine both utterances and existing intent labels, SynCID significantly enhances the semantic depth, subsequently realigning these enriched descriptors within the SLMs’ feature space to correct cluster distortion and promote robust learning of representations. A key advantage is its capacity for the early identification of new intents, a critical aspect for deploying conversational agents successfully. Additionally, SynCID leverages the in-context learning strengths of LLMs to generate labels for new intents. Thorough evaluations across a wide array of datasets have demonstrated its superior performance over traditional CID methods.
Keywords
Conversational Intent Discovery, CID, Large Language Models, LLMs, Small Language Models, SLMs
Discipline
Artificial Intelligence and Robotics | Computer Sciences
Research Areas
Data Science and Engineering; Intelligent Systems and Optimization
Areas of Excellence
Digital transformation
Publication
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (ACL 2024) : Bangkok, Thailand, August 11-16
First Page
14133
Last Page
14147
Identifier
10.18653/v1/2024.findings-acl.840
Publisher
Association for Computational Linguistics
City or Country
Bangkok, Thailand
Citation
LIANG, Jinggui; LIAO, Lizi; FEI, Hao; and JIANG, Jing.
Synergizing Large Language Models and pre-trained smaller models for conversational intent discovery. (2024). Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (ACL 2024) : Bangkok, Thailand, August 11-16. 14133-14147.
Available at: https://ink.library.smu.edu.sg/sis_research/9698
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Additional URL
https://doi.org/10.18653/v1/2024.findings-acl.840