Publication Type
Conference Proceeding Article
Version
publishedVersion
Publication Date
7-2023
Abstract
Generating persona-consistent dialogue responses is important for developing an intelligent conversational agent. Recent works typically fine-tune large-scale pre-trained models on this task by concatenating persona texts and dialogue history into a single input sequence from which the target response is generated. While simple and effective, our analysis shows that this popular practice is seriously affected by order sensitivity: different input orders of persona sentences significantly impact the quality and consistency of the generated responses, resulting in severe performance fluctuations (i.e., 29.4% on GPT2 and 83.2% on BART). To mitigate the order sensitivity problem, we propose a model-agnostic framework, ORder Insensitive Generation (ORIG), which enables dialogue models to learn robust representations under different persona orders and improves the consistency of response generation. Experiments on the Persona-Chat dataset demonstrate the effectiveness and superiority of our method with two dominant pre-trained models (GPT2 and BART).
Discipline
Databases and Information Systems
Research Areas
Data Science and Engineering
Areas of Excellence
Digital transformation
Publication
Proceedings of the 2023 Findings of the Association for Computational Linguistics, Toronto, Canada, July 9-14
First Page
7337
Last Page
7345
Identifier
10.18653/v1/2023.findings-acl.462
Publisher
Association for Computational Linguistics
City or Country
USA
Citation
CHEN, Liang; WANG, Hongru; DENG, Yang; KWAN, Wai-Chung; WANG, Zezhong; and WONG, Kam-Fai.
Towards robust personalized dialogue generation via order-insensitive representation regularization. (2023). Proceedings of the 2023 Findings of the Association for Computational Linguistics, Toronto, Canada, July 9-14. 7337-7345.
Available at: https://ink.library.smu.edu.sg/sis_research/9125
Copyright Owner and License
Authors
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.
Additional URL
https://doi.org/10.18653/v1/2023.findings-acl.462