LG · AI · NE · RO · ML · May 21, 2018

Hierarchical Reinforcement Learning with Hindsight

arXiv:1805.08180v2 · 97 citations
Originality: Incremental advance
AI Analysis

The paper addresses a key bottleneck in reinforcement learning, poor sample efficiency, though the contribution appears incremental because it builds on existing techniques such as universal value functions and hindsight learning.

The paper tackles poor sample efficiency in reinforcement learning when rewards are delayed and sparse. It introduces a method that learns temporally extended actions at multiple levels of abstraction, and it reports significantly accelerated learning across a variety of discrete and continuous tasks.

Reinforcement Learning (RL) algorithms can suffer from poor sample efficiency when rewards are delayed and sparse. We introduce a solution that enables agents to learn temporally extended actions at multiple levels of abstraction in a sample-efficient and automated fashion. Our approach combines universal value functions and hindsight learning, allowing agents to learn policies belonging to different time scales in parallel. We show that our method significantly accelerates learning in a variety of discrete and continuous tasks.
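
As a rough illustration of the hindsight-learning ingredient the abstract mentions, the sketch below relabels a failed goal-conditioned episode so that the state it actually reached is treated as the goal. It is not the authors' algorithm: the `Transition` layout, the 0 / -1 sparse reward, and the "use the final state as the goal" strategy are all assumptions chosen for illustration, and the hierarchical, multi-level part of the method is not shown.

```python
from typing import List, NamedTuple, Tuple

State = Tuple[int, int]  # hypothetical grid-world state (row, column)


class Transition(NamedTuple):
    """One goal-conditioned step; the field layout is an assumption."""
    state: State
    action: int
    next_state: State
    goal: State
    reward: float
    done: bool


def sparse_reward(next_state: State, goal: State) -> float:
    """Assumed sparse reward: 0 when the goal is reached, -1 otherwise."""
    return 0.0 if next_state == goal else -1.0


def relabel_with_hindsight(episode: List[Transition]) -> List[Transition]:
    """Return a copy of the episode whose goal is the state actually reached.

    Every transition gets the final achieved state as its goal, and its
    reward and done flag are recomputed against that substituted goal.
    """
    if not episode:
        return []
    achieved = episode[-1].next_state
    relabeled = []
    for step in episode:
        relabeled.append(
            step._replace(
                goal=achieved,
                reward=sparse_reward(step.next_state, achieved),
                done=step.next_state == achieved,
            )
        )
    return relabeled
```

Storing both the original and the relabeled transitions in a replay buffer gives a goal-conditioned (universal) value function at least one rewarded example per episode, even when the intended goal was never reached.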

