Publication Type
Conference Proceeding Article
Version
acceptedVersion
Publication Date
12-2021
Abstract
Recent work in multi-agent reinforcement learning (MARL) by Zhang et al. (ICML 2018) provided the first decentralized actor-critic algorithm to offer convergence guarantees. In that work, policies are stochastic and are defined on finite action spaces. We extend those results to develop a provably convergent decentralized actor-critic algorithm for learning deterministic policies on continuous action spaces. Deterministic policies are important in many real-world settings. To handle the lack of exploration inherent in deterministic policies, we provide results for the off-policy setting as well as the on-policy setting. We provide the main ingredients needed for this problem: the expression of a local deterministic policy gradient, a decentralized deterministic actor-critic algorithm, and convergence guarantees when the value functions are approximated linearly. This work enables decentralized MARL in high-dimensional action spaces and paves the way for more widespread application of MARL.
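For context, the local deterministic policy gradient mentioned above builds on the standard single-agent deterministic policy gradient (Silver et al., 2014); the sketch below reproduces only that well-known single-agent form, not the paper's per-agent decentralized expression, and the symbols $\mu_{\theta}$, $Q^{\mu}$, and $\rho^{\mu}$ are standard notation assumed here rather than taken from this record.

% Single-agent deterministic policy gradient (Silver et al., 2014);
% the paper derives a local, per-agent analogue of this expression
% with linearly approximated value functions.
\begin{equation*}
  \nabla_{\theta} J(\mu_{\theta})
    = \mathbb{E}_{s \sim \rho^{\mu}}\!\left[
        \nabla_{\theta} \mu_{\theta}(s)\,
        \nabla_{a} Q^{\mu}(s,a)\big|_{a=\mu_{\theta}(s)}
      \right]
\end{equation*}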
Discipline
Numerical Analysis and Scientific Computing
Research Areas
Intelligent Systems and Optimization
Areas of Excellence
Digital transformation
Publication
Proceedings of the 60th IEEE Conference on Decision and Control, CDC 2021, Austin, TX, December 14-17
First Page
1548
Last Page
1553
ISBN
9781665436595
Identifier
10.1109/CDC45484.2021.9683356
Publisher
IEEE
City or Country
Piscataway, NJ
Citation
GROSNIT, Antoine; CAI, Desmond; and WYNTER, Laura.
Decentralized deterministic multi-agent reinforcement learning. (2021). Proceedings of the 60th IEEE Conference on Decision and Control, CDC 2021, Austin, TX, December 14-17. 1548-1553.
Available at: https://ink.library.smu.edu.sg/sis_research/10361
Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Additional URL
https://doi.org/10.1109/CDC45484.2021.9683356