Publication Type
Conference Paper
Version
submittedVersion
Publication Date
2013
Abstract
In this paper, we seek robust policies for uncertain Markov Decision Processes (MDPs). Most robust optimization approaches for these problems have focused on computing maximin policies, which maximize the value under the worst realization of the uncertainty. Recent work has proposed minimax regret as a suitable alternative to the maximin objective for robust optimization. However, existing algorithms for minimax regret are restricted to models with uncertainty over rewards only. We provide algorithms that employ sampling to improve along multiple dimensions: (a) handling uncertainty over both transition and reward models; (b) capturing dependence of model uncertainties across state-action pairs and decision epochs; and (c) improving scalability while retaining quality bounds. We also provide a Sample Average Approximation (SAA) analysis to compute a posteriori error bounds. Finally, to demonstrate the empirical effectiveness of our sampling approaches, we provide comparisons against benchmark algorithms on two domains from the literature.
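A minimal sketch, in Python, of the sampled minimax-regret idea the abstract describes: draw sample realizations of the uncertain MDP, score each candidate policy by its worst regret across samples (the sample's optimal value minus the policy's value in that sample), and keep the policy whose worst-case regret is smallest. This is an illustration under assumed inputs, not the paper's algorithm; the toy two-state MDP, the brute-force enumeration of candidate policies, and all function names here are hypothetical.

# Sketch of sampled minimax regret for an uncertain finite-horizon MDP.
# All sizes and sampling distributions are hypothetical illustrations.
import itertools
import numpy as np

rng = np.random.default_rng(0)
S, A, H = 2, 2, 3  # states, actions, horizon (toy sizes)

def sample_model():
    """Draw one (transition, reward) realization of the uncertain MDP."""
    P = rng.dirichlet(np.ones(S), size=(S, A))  # P[s, a] is a distribution over next states
    R = rng.uniform(0.0, 1.0, size=(S, A))      # uncertain rewards
    return P, R

def policy_value(pi, P, R):
    """H-step value of a nonstationary deterministic policy pi[t][s], from state 0."""
    v = np.zeros(S)
    for t in reversed(range(H)):
        v = np.array([R[s, pi[t][s]] + P[s, pi[t][s]] @ v for s in range(S)])
    return v[0]

def optimal_value(P, R):
    """H-step optimal value from state 0, by backward induction."""
    v = np.zeros(S)
    for _ in range(H):
        v = np.max(R + np.einsum('san,n->sa', P, v), axis=1)
    return v[0]

samples = [sample_model() for _ in range(30)]
# Enumerate all deterministic nonstationary policies (feasible only at toy sizes).
candidates = [np.array(c).reshape(H, S)
              for c in itertools.product(range(A), repeat=H * S)]

def max_regret(pi):
    """Worst regret of pi across the sampled models."""
    return max(optimal_value(P, R) - policy_value(pi, P, R) for P, R in samples)

best = min(candidates, key=max_regret)
print("minimax-regret policy:", best, "sampled max regret:", max_regret(best))

Drawing more samples tightens the sample average approximation; the paper's SAA analysis supplies a posteriori bounds on the resulting gap, which this sketch does not compute.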
Discipline
Artificial Intelligence and Robotics | Business | Operations Research, Systems Engineering and Industrial Engineering
Publication
Advances in Neural Information Processing Systems (NIPS)
City or Country
Lake Tahoe, NV, United States of America
Citation
AHMED, Asrar; VARAKANTHAM, Pradeep Reddy; ADULYASAK, Yossiri; and JAILLET, Patrick.
Regret based Robust Solutions for Uncertain Markov Decision Processes. (2013). Advances in Neural Information Processing Systems (NIPS).
Available at: https://ink.library.smu.edu.sg/sis_research/1932
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Included in
Artificial Intelligence and Robotics Commons, Business Commons, Operations Research, Systems Engineering and Industrial Engineering Commons