Publication Type
Conference Proceeding Article
Version
publishedVersion
Publication Date
1-2019
Abstract
Decentralized MDPs (Dec-MDPs) provide a rigorous framework for collaborative multi-agent sequential decision-making under uncertainty. However, their computational complexity limits their practical impact. To address this, we focus on a class of Dec-MDPs consisting of independent collaborating agents that are tied together through a global reward function that depends upon their entire histories of states and actions to accomplish joint tasks. To overcome this scalability barrier, our main contributions are: (a) We propose a new actor-critic based Reinforcement Learning (RL) approach for event-based Dec-MDPs using successor features (SF), a value function representation that decouples the dynamics of the environment from the rewards; (b) We then present Dec-ESR (Decentralized Event based Successor Representation), which generalizes learning for event-based Dec-MDPs using SF within an end-to-end deep RL framework; (c) We also show that Dec-ESR allows useful transfer of information across related but different tasks, and hence bootstraps learning for faster convergence on new tasks; (d) For validation purposes, we test our approach on a large multi-agent coverage problem that models schedule coordination of agents in a real urban subway network, and achieve better quality solutions than previous best approaches.
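To illustrate what "decouples the dynamics of the environment from the rewards" means, the following is a minimal single-agent, tabular sketch of successor features, not the paper's Dec-ESR method. It assumes a fixed policy, a reward that is linear in state features (r(s) = phi(s) · w), and a hypothetical three-state chain; the function names, transition matrix, and weight vectors are all illustrative assumptions.

```python
def successor_features(P, phi, gamma=0.9, iters=500):
    """Iterate psi <- phi + gamma * P * psi for a fixed policy.

    P[s][t] is the probability of moving from state s to state t,
    phi[s] is the feature vector of state s. Illustrative only.
    """
    n, d = len(phi), len(phi[0])
    psi = [row[:] for row in phi]
    for _ in range(iters):
        psi = [
            [phi[s][k] + gamma * sum(P[s][t] * psi[t][k] for t in range(n))
             for k in range(d)]
            for s in range(n)
        ]
    return psi


def values(psi, w):
    """State values under reward weights w: V(s) = psi(s) . w."""
    return [sum(p * wk for p, wk in zip(row, w)) for row in psi]


# Hypothetical chain under a fixed policy: 0 -> 1 -> 2 -> 2 (absorbing).
P = [[0.0, 1.0, 0.0],
     [0.0, 0.0, 1.0],
     [0.0, 0.0, 1.0]]
# One-hot state features (an assumption made for readability).
phi = [[1.0, 0.0, 0.0],
       [0.0, 1.0, 0.0],
       [0.0, 0.0, 1.0]]

# The dynamics are summarized once in psi ...
psi = successor_features(P, phi)

# ... and reused for two different reward definitions (tasks).
w_task_a = [0.0, 1.0, 0.0]  # reward only in state 1
w_task_b = [0.0, 0.0, 1.0]  # reward only in state 2
v_a = values(psi, w_task_a)
v_b = values(psi, w_task_b)
```

In this sketch, switching from task A to task B only changes the weight vector; `psi` is computed once and reused, which is the property that makes transfer across related tasks plausible.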
Discipline
Software Engineering
Research Areas
Software and Cyber-Physical Systems
Publication
AAAI Conference on Artificial Intelligence (AAAI)
First Page
6054
Last Page
6061
Identifier
10.1609/aaai.v33i01.33016054
City or Country
Hawaii
Citation
GUPTA, Tarun; KUMAR, Akshat; and PARUCHURI, Praveen.
Successor features based multi-agent RL for event-based decentralized MDPs. (2019). AAAI Conference on Artificial Intelligence (AAAI). 6054-6061.
Available at: https://ink.library.smu.edu.sg/sis_research/5057
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Additional URL
https://doi.org/10.1609/aaai.v33i01.33016054