Value-based subgoal discovery and path planning for reaching long-horizon goals

Publication Type

Journal Article

Publication Date

2-2023

Abstract

Learning to reach long-horizon goals in spatial traversal tasks is a significant challenge for autonomous agents. Recent subgoal graph-based planning methods address this challenge by decomposing a goal into a sequence of shorter-horizon subgoals. These methods, however, use arbitrary heuristics for sampling or discovering subgoals, which may not conform to the cumulative reward distribution. Moreover, they are prone to learning erroneous connections (edges) between subgoals, especially those lying across obstacles. To address these issues, this article proposes a novel subgoal graph-based planning method called learning subgoal graph using value-based subgoal discovery and automatic pruning (LSGVP). The proposed method uses a subgoal discovery heuristic that is based on a cumulative reward (value) measure and yields sparse subgoals, including those lying on the higher cumulative reward paths. Moreover, LSGVP guides the agent to automatically prune the learned subgoal graph to remove the erroneous edges. The combination of these novel features helps the LSGVP agent to achieve higher cumulative positive rewards than other subgoal sampling or discovery heuristics, as well as higher goal-reaching success rates than other state-of-the-art subgoal graph-based planning methods.
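The general idea of planning over a subgoal graph and pruning erroneous edges can be illustrated with a minimal sketch. This is not the LSGVP algorithm from the article: the value-based subgoal discovery step is not shown, and the graph layout, the edge costs, the function names (shortest_path, prune_edge, plan_with_pruning) and the can_traverse check are all hypothetical, chosen only for illustration. The sketch assumes that an edge found to be untraversable (for example, one that crosses an obstacle) is removed from the graph and the path is then replanned.

```python
import heapq


def shortest_path(graph, start, goal):
    """Dijkstra over a dict-of-dicts graph {node: {neighbor: cost}}."""
    dist = {start: 0.0}
    prev = {}
    heap = [(0.0, start)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == goal:
            break
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry
        for nbr, cost in graph.get(node, {}).items():
            nd = d + cost
            if nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                prev[nbr] = node
                heapq.heappush(heap, (nd, nbr))
    if goal not in dist:
        return None  # goal unreachable in the current graph
    path = [goal]
    while path[-1] != start:
        path.append(prev[path[-1]])
    return path[::-1]


def prune_edge(graph, u, v):
    """Remove the directed edge u -> v if it exists."""
    graph.get(u, {}).pop(v, None)


def plan_with_pruning(graph, start, goal, can_traverse):
    """Replan until every edge on the path passes the (hypothetical) check."""
    while True:
        path = shortest_path(graph, start, goal)
        if path is None:
            return None
        for u, v in zip(path, path[1:]):
            if not can_traverse(u, v):
                prune_edge(graph, u, v)  # drop the erroneous edge, then replan
                break
        else:
            return path


# Toy subgoal graph: the edge A -> B is assumed to cross an obstacle.
graph = {"A": {"B": 1.0, "C": 1.0}, "B": {"D": 1.0}, "C": {"D": 3.0}, "D": {}}
path = plan_with_pruning(graph, "A", "D", lambda u, v: (u, v) != ("A", "B"))
print(path)
```

In this toy graph the cheapest route initially runs through B, but the A -> B edge fails the check, is pruned, and the planner falls back to the route through C.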

Keywords

Long-horizon goal-reaching, motion planning, path planning, reinforcement learning (RL), subgoal discovery, subgoal graph

Discipline

Databases and Information Systems | OS and Networks

Research Areas

Data Science and Engineering

Publication

IEEE Transactions on Neural Networks and Learning Systems

First Page

1

Last Page

13

ISSN

2162-237X

Identifier

10.1109/TNNLS.2023.3240004

Publisher

Institute of Electrical and Electronics Engineers

Additional URL

https://doi.org/10.1109/TNNLS.2023.3240004
