Publication Type
Conference Proceeding Article
Version
publishedVersion
Publication Date
5-2020
Abstract
Hierarchical Reinforcement Learning (HRL) is a promising approach to solving complex tasks that may be challenging for traditional reinforcement learning. HRL achieves this by decomposing a task into shorter-horizon subgoals that are simpler to achieve. Autonomous discovery of such subgoals is an important part of HRL. Recently, end-to-end HRL methods have been used to reduce the overhead of offline subgoal discovery by searching for useful subgoals while simultaneously learning optimal policies in the hierarchy. However, these methods may still learn slowly when the search space used by the high-level policy to find subgoals is large. We propose LIDOSS, an end-to-end HRL method with an integrated heuristic for subgoal discovery. In LIDOSS, the search space of the high-level policy is reduced by focusing only on subgoal states with high saliency. We evaluate LIDOSS on continuous control tasks in the MuJoCo Ant domain. The results show that LIDOSS outperforms Hierarchical Actor-Critic (HAC), a state-of-the-art HRL method, on fixed-goal tasks.
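The core idea in the abstract, restricting a high-level policy's subgoal search space to high-saliency states, can be sketched as follows. This is a minimal illustration under assumed definitions (a visit-count proxy for saliency, illustrative names such as select_subgoal and threshold), not the paper's actual LIDOSS algorithm.

```python
import numpy as np

# Hypothetical sketch: filter the high-level policy's subgoal candidates
# by a saliency score, then sample only from the reduced set.
# The saliency heuristic below (visit frequency) is an assumption;
# the paper's actual heuristic may differ.

def saliency_scores(visit_counts):
    # Treat frequently visited states as salient (illustrative proxy).
    return visit_counts / visit_counts.sum()

def select_subgoal(states, visit_counts, rng, threshold=0.05):
    """Sample a subgoal only from states whose saliency exceeds a threshold."""
    scores = saliency_scores(visit_counts)
    candidates = np.flatnonzero(scores >= threshold)
    if candidates.size == 0:
        # Fall back to the full search space if nothing is salient enough.
        candidates = np.arange(len(states))
    # Sample proportionally to saliency within the reduced set.
    probs = scores[candidates] / scores[candidates].sum()
    return states[rng.choice(candidates, p=probs)]

rng = np.random.default_rng(0)
states = np.arange(10)                                  # toy discrete state ids
visit_counts = rng.integers(1, 100, size=10).astype(float)
print(select_subgoal(states, visit_counts, rng))
```

In this sketch, the threshold directly controls how aggressively the subgoal search space shrinks; a higher threshold leaves fewer candidate subgoals for the high-level policy to consider.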
Keywords
Hierarchical Reinforcement Learning, Reinforcement Learning, Subgoal Discovery
Discipline
Databases and Information Systems
Research Areas
Data Science and Engineering
Publication
Proceedings of the 19th International Conference on Autonomous Agents and Multiagent Systems, AAMAS 2020, Auckland, New Zealand, May 9-13
First Page
1963
Last Page
1965
ISBN
9781450375184
Identifier
10.5555/3398761.3399042
Publisher
International Foundation for Autonomous Agents and Multiagent Systems (IFAAMAS)
City or Country
Virtual, Auckland
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.