Publication Type
Conference Proceeding Article
Version
publishedVersion
Publication Date
2-2017
Abstract
We propose a new method for transferring a policy from a source task to a target task in model-based reinforcement learning. Our work is motivated by scenarios where a robotic agent operates in similar but challenging environments, such as hospital wards, that differ in structural arrangement or in obstacles such as furniture. We address problems that require fast responses in new scenarios, adapted from the agent's incomplete prior knowledge. We present SEAPoT, an efficient selective exploration strategy that maximally reuses the source task policy. Reuse is made efficient by identifying the sub-spaces that differ in the target environment, which limits the exploration needed in the target task. We empirically show that SEAPoT achieves better jump starts and cumulative average rewards than existing state-of-the-art policy reuse methods.
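The idea of reusing a source policy everywhere except in the sub-spaces that changed can be illustrated with a minimal sketch. It is not the authors' algorithm, which this record does not specify. It assumes grid-world maps given as lists of strings, and it assumes the changed cells can be found by comparing the two maps directly; the abstract makes neither claim. All names here (`changed_cells`, `exploration_region`, `select_action`, `radius`) are hypothetical.

```python
from typing import Callable, Dict, List, Set, Tuple

State = Tuple[int, int]  # (row, column) in a grid world; an assumed state encoding


def changed_cells(source_map: List[str], target_map: List[str]) -> Set[State]:
    """Return the cells whose contents differ between the two maps."""
    return {
        (r, c)
        for r, (src_row, tgt_row) in enumerate(zip(source_map, target_map))
        for c, (src, tgt) in enumerate(zip(src_row, tgt_row))
        if src != tgt
    }


def exploration_region(
    changed: Set[State], n_rows: int, n_cols: int, radius: int = 1
) -> Set[State]:
    """Grow each changed cell by `radius` cells, clipped to the grid bounds."""
    region: Set[State] = set()
    for r, c in changed:
        for dr in range(-radius, radius + 1):
            for dc in range(-radius, radius + 1):
                rr, cc = r + dr, c + dc
                if 0 <= rr < n_rows and 0 <= cc < n_cols:
                    region.add((rr, cc))
    return region


def select_action(
    state: State,
    source_policy: Dict[State, str],
    region: Set[State],
    explore: Callable[[State], str],
) -> str:
    """Explore inside the changed region; reuse the source policy elsewhere."""
    if state in region:
        return explore(state)
    return source_policy[state]


# Hypothetical example: one new obstacle ('#') appears in the target map.
source_map = ["....", "....", "...."]
target_map = ["....", ".#..", "...."]
changed = changed_cells(source_map, target_map)
region = exploration_region(changed, n_rows=3, n_cols=4)
```

In this sketch only the cells around the new obstacle are explored, so the amount of exploration grows with the size of the change and not with the size of the environment.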
Keywords
Transfer learning, policy transfer, reinforcement learning
Discipline
Numerical Analysis and Scientific Computing | Theory and Algorithms
Research Areas
Intelligent Systems and Optimization
Publication
AAAI '17: Proceedings of the 31st Conference on Artificial Intelligence, San Francisco, CA, USA, 2017 February 4-9
First Page
4975
Last Page
4976
Publisher
AAAI Press
City or Country
Palo Alto, CA
Citation
NARAYAN, Akshay; LI, Zhuoru; and LEONG, Tze-Yun.
SEAPoT-RL: Selective exploration algorithm for policy transfer in RL. (2017). AAAI '17: Proceedings of the 31st Conference on Artificial Intelligence, San Francisco, CA, USA, 2017 February 4-9. 4975-4976.
Available at: https://ink.library.smu.edu.sg/sis_research/3762
Copyright Owner and License
Publisher
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Additional URL
https://aaai.org/ocs/index.php/AAAI/AAAI17/paper/view/14729