Publication Type
Conference Proceeding Article
Version
acceptedVersion
Publication Date
12-2025
Abstract
We study offline imitation learning (IL) in cooperative multi-agent settings where demonstrations are unlabeled and of mixed quality, containing both expert and suboptimal trajectories. Our proposed solution is structured in two stages, trajectory labeling and multi-agent imitation learning, which are designed jointly to enable effective learning from heterogeneous, unlabeled data. In the first stage, we combine advances in large language models and preference-based reinforcement learning to construct a progressive labeling pipeline that distinguishes expert-quality trajectories. In the second stage, we introduce MisoDICE, a novel multi-agent IL algorithm that leverages these labels to learn robust policies while addressing the computational complexity of large joint state-action spaces. By extending the popular single-agent DICE framework to multi-agent settings with a new value decomposition and mixing architecture, our method yields a convex policy optimization objective and ensures consistency between global and local policies. We evaluate MisoDICE on multiple standard multi-agent RL benchmarks and demonstrate superior performance, especially when expert data is scarce.
Keywords
offline imitation learning, multi-agent reinforcement learning, trajectory labeling, preference-based reinforcement learning, value decomposition
Discipline
Artificial Intelligence and Robotics
Areas of Excellence
Digital transformation
Publication
Proceedings of the 39th Conference on Neural Information Processing Systems (NeurIPS 2025), San Diego, CA, December 2-7
First Page
1
Last Page
43
Publisher
Advances in Neural Information Processing Systems
City or Country
San Diego, US
Citation
BUI, The Viet; MAI, Tien; and NGUYEN, Hong Thanh.
MisoDICE: Multi-agent imitation from mixed-quality demonstrations. (2025). Proceedings of the 39th Conference on Neural Information Processing Systems (NeurIPS 2025), San Diego, CA, December 2-7. 1-43.
Available at: https://ink.library.smu.edu.sg/sis_research/10709
Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.