Research Collection School Of Computing and Information Systems

Decentralized multi-agent reinforcement learning in average-reward dynamic DCOPs

Duc Thien Nguyen, Singapore Management University
William YEOH, New Mexico State University
Hoong Chuin LAU, Singapore Management University
Shlomo Zilberstein, University of Massachusetts - Amherst
Chongjie ZHANG, Massachusetts Institute of Technology

Publication Type

Conference Proceeding Article

Version

publishedVersion

Publication Date

7-2014

Abstract

Researchers have introduced the Dynamic Distributed Constraint Optimization Problem (Dynamic DCOP) formulation to model dynamically changing multi-agent coordination problems, where a dynamic DCOP is a sequence of (static canonical) DCOPs, each partially different from the DCOP preceding it. Existing work typically assumes that the problem in each time step is decoupled from the problems in other time steps, which might not hold in some applications. Therefore, in this paper, we make the following contributions: (i) We introduce a new model, called Markovian Dynamic DCOPs (MD-DCOPs), where the DCOP in the next time step is a function of the value assignments in the current time step; (ii) We introduce two distributed reinforcement learning algorithms, the Distributed RVI Q-learning algorithm and the Distributed R-learning algorithm, that balance exploration and exploitation to solve MD-DCOPs in an online manner; and (iii) We empirically evaluate them against an existing multi-arm bandit DCOP algorithm on dynamic DCOPs.

Discipline

Artificial Intelligence and Robotics | Operations Research, Systems Engineering and Industrial Engineering

Publication

Proceedings of the Twenty-Eighth AAAI Conference on Artificial Intelligence: Quebec City, 27-31 July 2014

First Page

1447

Last Page

1455

ISBN

9783540201502

Publisher

AAAI

City or Country

Menlo Park, CA

Citation

Nguyen, Duc Thien; YEOH, William; LAU, Hoong Chuin; Zilberstein, Shlomo; and ZHANG, Chongjie. Decentralized multi-agent reinforcement learning in average-reward dynamic DCOPs. (2014). Proceedings of the Twenty-Eighth AAAI Conference on Artificial Intelligence: Quebec City, 27-31 July 2014. 1447-1455.
Available at: https://ink.library.smu.edu.sg/sis_research/2667

Copyright Owner and License

LARC

Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

Additional URL

https://www.aaai.org/ocs/index.php/AAAI/AAAI14/paper/view/8413

Included in

Artificial Intelligence and Robotics Commons, Operations Research, Systems Engineering and Industrial Engineering Commons
