LG · AI · Sep 14, 2023

Goal Space Abstraction in Hierarchical Reinforcement Learning via Set-Based Reachability Analysis

arXiv:2309.07675v2 · 7 citations · h-index: 18
Originality Highly original
AI Analysis

This work addresses the challenge of automating symbolic goal abstraction in hierarchical reinforcement learning. The contribution is incremental in that it builds on existing feudal HRL methods, but it introduces a novel approach to goal discovery.

The paper tackles the problem of manually defined symbolic goal representations in hierarchical reinforcement learning. It proposes a developmental mechanism that autonomously discovers an emergent goal representation through set-based reachability analysis, yielding interpretable, transferable, and data-efficient learning on complex navigation tasks.

Open-ended learning benefits immensely from the use of symbolic methods for goal representation, as they offer ways to structure knowledge for efficient and transferable learning. However, existing Hierarchical Reinforcement Learning (HRL) approaches that rely on symbolic reasoning are often limited because they require a manual goal representation. The challenge in autonomously discovering a symbolic goal representation is that it must preserve critical information, such as the environment dynamics. In this paper, we propose a developmental mechanism for goal discovery via an emergent representation that abstracts (i.e., groups together) sets of environment states that have similar roles in the task. We introduce a Feudal HRL algorithm that concurrently learns both the goal representation and a hierarchical policy. The algorithm uses symbolic reachability analysis for neural networks to approximate the transition relation among sets of states and to refine the goal representation. We evaluate our approach on complex navigation tasks, showing that the learned representation is interpretable and transferable, and that it results in data-efficient learning.
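To make the idea of set-based reachability and refinement concrete, here is a minimal sketch. It is not the paper's implementation: it assumes that sets of states are axis-aligned boxes, that the learned forward model is a small ReLU network given as explicit weight lists, and that bounds are propagated by simple interval arithmetic. All function names (`reach`, `refine`, and the helpers) and the toy model are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

Interval = Tuple[float, float]
Box = List[Interval]  # one (low, high) pair per state dimension


def affine_bounds(box: Box, weights, bias) -> Box:
    """Interval image of x -> W x + b over a box."""
    out = []
    for row, b in zip(weights, bias):
        lo = hi = b
        for w, (l, u) in zip(row, box):
            # A negative weight swaps which endpoint gives the lower bound.
            lo += w * l if w >= 0 else w * u
            hi += w * u if w >= 0 else w * l
        out.append((lo, hi))
    return out


def reach(box: Box, layers) -> Box:
    """Over-approximate the network's output set (ReLU between layers)."""
    for i, (weights, bias) in enumerate(layers):
        box = affine_bounds(box, weights, bias)
        if i < len(layers) - 1:
            box = [(max(0.0, l), max(0.0, u)) for l, u in box]
    return box


def contained(inner: Box, outer: Box) -> bool:
    return all(ol <= il and iu <= ou for (il, iu), (ol, ou) in zip(inner, outer))


def overlaps(a: Box, b: Box) -> bool:
    # Strict inequalities: boxes that only touch at a boundary do not overlap.
    return all(al < bu and bl < au for (al, au), (bl, bu) in zip(a, b))


def refine(box: Box, target: Box, layers, depth: int = 4):
    """Split a set of states until each piece provably reaches or avoids target."""
    image = reach(box, layers)
    if contained(image, target):
        return [(box, "reaches")]
    if not overlaps(image, target):
        return [(box, "avoids")]
    if depth == 0:
        return [(box, "unknown")]
    d = max(range(len(box)), key=lambda k: box[k][1] - box[k][0])
    l, u = box[d]
    mid = (l + u) / 2
    left, right = list(box), list(box)
    left[d], right[d] = (l, mid), (mid, u)
    return refine(left, target, layers, depth - 1) + refine(right, target, layers, depth - 1)


# Toy 1-D forward model (an assumption): next_state = relu(x) + 1.
toy_model = [([[1.0]], [0.0]), ([[1.0]], [1.0])]
partition = refine([(0.0, 4.0)], [(3.0, 5.0)], toy_model)
print(partition)
```

In this toy run, the initial set [0, 4] is split into [0, 2], whose image cannot reach the target [3, 5], and [2, 4], whose image is contained in it. Each resulting piece behaves uniformly with respect to the target, which is the sense in which a refined abstraction groups states that play similar roles.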

