Research Collection School Of Computing and Information Systems

Dual formulations for optimizing Dec-POMDP controllers

Akshat KUMAR, Singapore Management University
Hala MOSTAFA, University of Massachusetts Amherst
Shlomo ZILBERSTEIN, University of Massachusetts Amherst

Publication Type

Conference Proceeding Article

Version

publishedVersion

Publication Date

6-2016

Abstract

The decentralized POMDP (Dec-POMDP) is an expressive model for multi-agent planning. Finite-state controllers (FSCs), often used to represent policies for infinite-horizon problems, offer a compact, simple-to-execute policy representation. We exploit novel connections between optimizing decentralized FSCs and the dual linear program for MDPs. Consequently, we describe a dual mixed integer linear program (MIP) for optimizing deterministic FSCs. We exploit the Dec-POMDP structure to devise a compact MIP and formulate constraints that result in policies executable in partially-observable decentralized settings. We show analytically that the dual formulation can also be exploited within the expectation maximization (EM) framework to optimize stochastic FSCs. The resulting EM algorithm can be implemented by solving a sequence of linear programs, without requiring expensive message-passing over the Dec-POMDP DBN. We also present an efficient technique for policy improvement based on a weighted entropy measure. Compared with state-of-the-art FSC methods, our approach offers over an order-of-magnitude speedup, while producing similar or better solutions.

Keywords

Maximum principle, Message passing, Multi agent systems, Scheduling, Solar concentrators, Stochastic systems

Discipline

Artificial Intelligence and Robotics | Computer Sciences | Operations Research, Systems Engineering and Industrial Engineering

Research Areas

Intelligent Systems and Optimization

Publication

Proceedings of the 26th International Conference on Automated Planning and Scheduling ICAPS 2016, London, June 12-17

First Page

202

Last Page

210

Publisher

AAAI Press

City or Country

Palo Alto, CA

Citation

KUMAR, Akshat; MOSTAFA, Hala; and ZILBERSTEIN, Shlomo. Dual formulations for optimizing Dec-POMDP controllers. (2016). Proceedings of the 26th International Conference on Automated Planning and Scheduling ICAPS 2016, London, June 12-17. 202-210.
Available at: https://ink.library.smu.edu.sg/sis_research/3395

Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.

Additional URL

http://www.aaai.org/ocs/index.php/ICAPS/ICAPS16/paper/view/13124

Included in

Artificial Intelligence and Robotics Commons, Operations Research, Systems Engineering and Industrial Engineering Commons
