LG · AI · Jun 29, 2021

Globally Optimal Hierarchical Reinforcement Learning for Linearly-Solvable Markov Decision Processes

arXiv:2106.15380v3 · 9 citations
Originality: Incremental advance
AI Analysis

The paper addresses sample efficiency in hierarchical reinforcement learning. The advance is incremental, since it builds on existing frameworks for linearly-solvable Markov decision processes.

The paper tackles hierarchical reinforcement learning for linearly-solvable Markov decision processes. It partitions the state space and represents value functions at several levels of abstraction to estimate the optimal values, which lets it learn a globally optimal policy without the non-stationarity of high-level decisions. When the set of boundary states is smaller than the full state space, the approach can have significantly smaller sample complexity than a flat learner, which the authors validate empirically.

In this work we present a novel approach to hierarchical reinforcement learning for linearly-solvable Markov decision processes. Our approach assumes that the state space is partitioned, and the subtasks consist in moving between the partitions. We represent value functions on several levels of abstraction, and use the compositionality of subtasks to estimate the optimal values of the states in each partition. The policy is implicitly defined on these optimal value estimates, rather than being decomposed among the subtasks. As a consequence, our approach can learn the globally optimal policy, and does not suffer from the non-stationarity of high-level decisions. If several partitions have equivalent dynamics, the subtasks of those partitions can be shared. If the set of boundary states is smaller than the entire state space, our approach can have significantly smaller sample complexity than that of a flat learner, and we validate this empirically in several experiments.
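For readers unfamiliar with the term, the sketch below illustrates what "linearly-solvable" means, using a flat (non-hierarchical) first-exit problem. It assumes the standard cost-based formulation, in which the desirability z(s) = exp(-v(s)) satisfies a linear fixed-point equation and the optimal policy reweights the passive dynamics by z. This is not the paper's hierarchical algorithm, and the paper's own notation may differ. The function names, the five-state chain, and the cost values are hypothetical and chosen only for illustration.

```python
import numpy as np

def solve_first_exit_lmdp(P, q, terminal, n_iter=10_000, tol=1e-12):
    """Iterate z = exp(-q) * (P @ z) with z fixed at terminal states.

    P        : (n, n) passive transition matrix, rows sum to 1
    q        : (n,) state costs (assumed cost-based formulation)
    terminal : (n,) boolean mask of terminal states
    """
    g = np.exp(-q)
    z = np.ones(len(q))
    z[terminal] = g[terminal]
    for _ in range(n_iter):
        z_new = g * (P @ z)
        z_new[terminal] = g[terminal]  # boundary condition at terminal states
        converged = np.max(np.abs(z_new - z)) < tol
        z = z_new
        if converged:
            break
    return z

def optimal_policy(P, z):
    """Optimal controlled dynamics: a(s'|s) proportional to P(s'|s) * z(s')."""
    a = P * z[None, :]
    return a / a.sum(axis=1, keepdims=True)

# Hypothetical example: a 5-state chain with a random-walk passive dynamics.
# State 4 is the terminal goal (cost 0); every other state has cost 1.
n = 5
P = np.zeros((n, n))
P[0, 0] = P[0, 1] = 0.5          # left end reflects
for s in range(1, n - 1):
    P[s, s - 1] = P[s, s + 1] = 0.5
P[n - 1, n - 1] = 1.0            # terminal state is absorbing

q = np.array([1.0, 1.0, 1.0, 1.0, 0.0])
terminal = np.array([False, False, False, False, True])

z = solve_first_exit_lmdp(P, q, terminal)
v = -np.log(z)                   # optimal cost-to-go
policy = optimal_policy(P, z)
```

In this toy chain the cost-to-go `v` decreases toward the goal, and `policy` shifts probability mass from the symmetric random walk toward the right-hand neighbour. Because the fixed-point equation is linear in `z`, solutions for different terminal costs can be combined linearly, which is the compositionality property the abstract refers to.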

Code Implementations: 1 repo