LG | AI | ML | Oct 2, 2020

Exploration in Approximate Hyper-State Space for Meta Reinforcement Learning

arXiv:2010.01062v3 | 45 citations
Originality: Incremental advance
AI Analysis

This paper addresses efficient exploration in meta-RL, where agents must learn quickly in sparse-reward environments. The advance appears incremental, since it builds on existing hyper-state concepts.

The paper tackles the failure of meta-reinforcement learning under sparse rewards. It proposes HyperX, which uses novel reward bonuses to explore in approximate hyper-state space, yielding better task-exploration and more successful adaptation to new tasks than existing methods.

To rapidly learn a new task, it is often essential for agents to explore efficiently -- especially when performance matters from the first timestep. One way to learn such behaviour is via meta-learning. Many existing methods however rely on dense rewards for meta-training, and can fail catastrophically if the rewards are sparse. Without a suitable reward signal, the need for exploration during meta-training is exacerbated. To address this, we propose HyperX, which uses novel reward bonuses for meta-training to explore in approximate hyper-state space (where hyper-states represent the environment state and the agent's task belief). We show empirically that HyperX meta-learns better task-exploration and adapts more successfully to new tasks than existing methods.
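
To make the idea of a reward bonus over hyper-states concrete, here is a minimal sketch. It is not the paper's method: it assumes a simple count-based novelty bonus over discretised hyper-states, and the class name `HyperStateCountBonus` and the parameters `scale` and `resolution` are hypothetical. The only detail taken from the abstract is that a hyper-state combines the environment state with the agent's task belief.

```python
import math
from collections import defaultdict


class HyperStateCountBonus:
    """Count-based novelty bonus over discretised hyper-states (illustrative only)."""

    def __init__(self, scale=1.0, resolution=0.1):
        self.scale = scale            # hypothetical bonus weight
        self.resolution = resolution  # hypothetical bin width for discretisation
        self.counts = defaultdict(int)

    def _key(self, state, belief):
        # A hyper-state is the environment state together with the task belief.
        hyper_state = tuple(state) + tuple(belief)
        # Discretise so that nearby hyper-states share one visit counter.
        return tuple(round(x / self.resolution) for x in hyper_state)

    def bonus(self, state, belief):
        # Record the visit, then return a bonus that shrinks with the visit count.
        key = self._key(state, belief)
        self.counts[key] += 1
        return self.scale / math.sqrt(self.counts[key])


bonus_fn = HyperStateCountBonus()
state = [0.0, 1.0]
first = bonus_fn.bonus(state, belief=[0.5, 0.5])    # unseen hyper-state
second = bonus_fn.bonus(state, belief=[0.5, 0.5])   # same hyper-state again
changed = bonus_fn.bonus(state, belief=[0.9, 0.1])  # same state, new belief
```

In this sketch, revisiting the same environment state with an unchanged belief earns a smaller bonus, while the same state under a different belief counts as new. During meta-training such a bonus would typically be added to the environment reward, which is one way to supply a learning signal when the task reward itself is sparse.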

Code Implementations: 1 repo