Research Collection School Of Computing and Information Systems

Simulation-free hierarchical latent policy planning for proactive dialogues

Publication Type

Conference Proceeding Article

Version

publishedVersion

Publication Date

3-2025

Abstract

Recent advancements in proactive dialogues have garnered significant attention, particularly for more complex objectives (e.g. emotion support and persuasion). Unlike traditional task-oriented dialogues, proactive dialogues demand advanced policy planning and adaptability, requiring rich scenarios and comprehensive policy repositories to develop such systems. However, existing approaches tend to rely on Large Language Models (LLMs) for user simulation and online learning, leading to biases that diverge from realistic scenarios and result in suboptimal efficiency. Moreover, these methods depend on manually defined, context-independent, coarse-grained policies, which not only incur high expert costs but also raise concerns regarding their completeness. In our work, we highlight the potential for automatically discovering policies directly from raw, real-world dialogue records. To this end, we introduce a novel dialogue policy planning framework, LDPP. It fully automates the process from mining policies in dialogue records to learning policy planning. Specifically, we employ a variant of the Variational Autoencoder to discover fine-grained policies represented as latent vectors. After automatically annotating the data with these latent policy labels, we propose an Offline Hierarchical Reinforcement Learning (RL) algorithm in the latent space to develop effective policy planning capabilities. Our experiments demonstrate that LDPP outperforms existing methods on two proactive scenarios, even surpassing ChatGPT with only a 1.8-billion-parameter LLM.

Discipline

Artificial Intelligence and Robotics

Areas of Excellence

Digital transformation

Publication

AAAI'25/IAAI'25/EAAI'25: Proceedings of the Thirty-Ninth AAAI Conference on Artificial Intelligence and Thirty-Seventh Conference on Innovative Applications of Artificial Intelligence and Fifteenth Symposium on Educational Advances in Artificial Intelligence, Philadelphia, Pennsylvania, February 25 - March 4

First Page

24032

Last Page

24040

Identifier

10.1609/aaai.v39i22.34577

Publisher

ACM

City or Country

New York

Citation

HE, Tao; LIAO, Lizi; CAO, Yixin; LIU, Yuanxing; SUN, Yiheng; CHEN, Zerui; LIU, Ming; and QIN, Bing. Simulation-free hierarchical latent policy planning for proactive dialogues. (2025). AAAI'25/IAAI'25/EAAI'25: Proceedings of the Thirty-Ninth AAAI Conference on Artificial Intelligence and Thirty-Seventh Conference on Innovative Applications of Artificial Intelligence and Fifteenth Symposium on Educational Advances in Artificial Intelligence, Philadelphia, Pennsylvania, February 25 - March 4. 24032-24040.
Available at: https://ink.library.smu.edu.sg/sis_research/10764

Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

Additional URL

https://doi.org/10.1609/aaai.v39i22.34577

Included in

Artificial Intelligence and Robotics Commons
