LG · Oct 26, 2021

Landmark-Guided Subgoal Generation in Hierarchical Reinforcement Learning

arXiv:2110.13625v3 · 79 citations
Originality: Incremental advance
AI Analysis

The paper addresses training inefficiency on complex, long-horizon RL tasks and is an incremental improvement in hierarchical reinforcement learning.

The paper tackles poor exploration in goal-conditioned hierarchical reinforcement learning, which stems from the large action space of the high-level policy. It introduces a framework that uses landmarks to guide subgoal generation and reports better performance than prior methods across a variety of control tasks.

Goal-conditioned hierarchical reinforcement learning (HRL) has shown promising results for solving complex and long-horizon RL tasks. However, the action space of the high-level policy in goal-conditioned HRL is often large, which results in poor exploration and inefficient training. In this paper, we present HIerarchical reinforcement learning Guided by Landmarks (HIGL), a novel framework for training a high-level policy with a reduced action space guided by landmarks, i.e., promising states to explore. HIGL has two key components: (a) sampling landmarks that are informative for exploration and (b) encouraging the high-level policy to generate a subgoal towards a selected landmark. For (a), we consider two criteria: coverage of the entire visited state space (i.e., dispersion of states) and novelty of states (i.e., prediction error of a state). For (b), we select the very first landmark on the shortest path in a graph whose nodes are landmarks. Our experiments demonstrate that our framework outperforms prior methods across a variety of control tasks, thanks to efficient exploration guided by landmarks.
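To make components (a) and (b) more concrete, here is a minimal Python sketch, not the authors' implementation. It assumes states are plain coordinate tuples, uses Euclidean distance as a stand-in for whatever distance estimate the method actually uses, and covers only the coverage criterion (greedy farthest-point sampling); the novelty criterion is omitted. The function names and the `max_edge` threshold are hypothetical and chosen only for illustration.

```python
import heapq
import math


def farthest_point_sampling(states, k):
    """Coverage-based landmark sampling (assumed greedy variant).

    Repeatedly picks the state farthest from all landmarks chosen so far,
    so the landmarks spread out over the visited states.
    """
    chosen = [0]  # assumption: seed with the first visited state
    min_dist = [math.dist(s, states[0]) for s in states]
    while len(chosen) < min(k, len(states)):
        nxt = max(range(len(states)), key=lambda i: min_dist[i])
        chosen.append(nxt)
        # Track each state's distance to its nearest chosen landmark.
        min_dist = [min(d, math.dist(s, states[nxt]))
                    for d, s in zip(min_dist, states)]
    return [states[i] for i in chosen]


def first_landmark_on_shortest_path(current, goal, landmarks, max_edge):
    """Dijkstra over {current, landmarks, goal}.

    Two nodes are connected when their distance is at most max_edge
    (a hypothetical threshold). Returns the first node after `current`
    on the shortest path to `goal`, or None if the goal is unreachable.
    """
    nodes = [current] + list(landmarks) + [goal]
    src, dst = 0, len(nodes) - 1
    dist = {src: 0.0}
    parent = {}
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == dst:
            break
        if d > dist.get(u, math.inf):
            continue  # stale heap entry
        for v in range(len(nodes)):
            w = math.dist(nodes[u], nodes[v])
            if v != u and w <= max_edge and d + w < dist.get(v, math.inf):
                dist[v] = d + w
                parent[v] = u
                heapq.heappush(heap, (d + w, v))
    if dst not in parent:
        return None
    node = dst
    while parent[node] != src:  # walk back to the node right after src
        node = parent[node]
    return nodes[node]


# Usage on an L-shaped set of visited states (illustrative data only).
visited = [(0, 0), (1, 0), (2, 0), (3, 0), (3, 1), (3, 2), (3, 3)]
landmarks = farthest_point_sampling(visited, 3)
target = first_landmark_on_shortest_path((0, 0), (3, 3), visited[1:-1], 1.5)
```

In a full agent, the selected landmark would then be used to bias the high-level policy's subgoal towards it; how that encouragement is implemented is not specified in the abstract, so it is left out of this sketch.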

Code Implementations: 1 repo