Publication Type
Conference Proceeding Article
Version
publishedVersion
Publication Date
8-2021
Abstract
Many real-world systems involve interaction among a large number of agents to achieve a common goal, for example, air traffic control. Several model-free RL algorithms have been proposed for such settings. A key limitation is that the empirical reward signal in the model-free case is not very effective in addressing the multiagent credit assignment problem, that is, the problem of determining each agent's contribution to the team's success. This results in lower solution quality and high sample complexity. To address this, we contribute (a) an approach to learn a differentiable reward model for both continuous and discrete action settings by exploiting the collective nature of interactions among agents, a feature commonly present in large-scale multiagent applications; (b) a shaped reward model analytically derived from the learned reward model to address the key challenge of credit assignment; and (c) a model-based multiagent RL approach that integrates shaped rewards into well-known RL algorithms such as policy gradient and soft actor-critic. Compared to previous methods, our learned reward models are more accurate, and our approaches achieve better solution quality on synthetic and real-world instances of air traffic control and cooperative navigation with large agent populations.
Keywords
Model representation and learning domain models for planning, Multi-agent planning and learning
Discipline
Databases and Information Systems
Research Areas
Data Science and Engineering
Publication
Proceedings of the Thirty-First International Conference on Automated Planning and Scheduling, Virtual Online, August 2-13
First Page
588
Last Page
596
Publisher
AAAI Press
City or Country
California USA
Citation
SINGH, Arambam James; KUMAR, Akshat; and LAU, Hoong Chuin.
Learning and exploiting shaped reward models for large scale multiagent RL. (2021). Proceedings of the Thirty-First International Conference on Automated Planning and Scheduling, Virtual Online, August 2-13. 588-596.
Available at: https://ink.library.smu.edu.sg/sis_research/6899
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.