LG · ML · Oct 7, 2019

Self-Paced Contextual Reinforcement Learning

arXiv:1910.02826v1 · 54 citations · Has Code
Originality: Incremental advance
AI Analysis

This work addresses generalization and adaptation for autonomous robots, presenting an incremental improvement over existing contextual reinforcement learning methods.

The paper tackles inefficient sampling in contextual reinforcement learning. It introduces a curriculum learning scheme in which the agent controls the intermediate task distribution. The authors report drastically improved sample efficiency and successful learning in scenarios where classical approaches perform sub-optimally.

Generalization and adaptation of learned skills to novel situations is a core requirement for intelligent autonomous robots. Although contextual reinforcement learning provides a principled framework for learning and generalization of behaviors across related tasks, it generally relies on uninformed sampling of environments from an unknown, uncontrolled context distribution, thus missing the benefits of structured, sequential learning. We introduce a novel relative entropy reinforcement learning algorithm that gives the agent the freedom to control the intermediate task distribution, allowing for its gradual progression towards the target context distribution. Empirical evaluation shows that the proposed curriculum learning scheme drastically improves sample efficiency and enables learning in scenarios with both broad and sharp target context distributions in which classical approaches perform sub-optimally.
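The idea of gradually moving an intermediate task distribution toward the target can be illustrated with a small sketch. This is not the paper's algorithm: it assumes a 1-D Gaussian context distribution, ignores the reward term entirely, and simply takes the largest step toward the target whose relative entropy (KL divergence) from the current distribution stays below a bound `eps`. All function names and parameter values here are hypothetical and chosen only for illustration.

```python
import math


def kl_gauss(mu_p, sd_p, mu_q, sd_q):
    """KL(p || q) between two 1-D Gaussians p and q."""
    return (math.log(sd_q / sd_p)
            + (sd_p ** 2 + (mu_p - mu_q) ** 2) / (2 * sd_q ** 2)
            - 0.5)


def step_toward_target(mu, sd, mu_t, sd_t, eps):
    """Move (mu, sd) toward the target (mu_t, sd_t) as far as possible
    while keeping KL(new || old) <= eps.

    Assumption: linear interpolation of mean and standard deviation.
    The KL to the old distribution grows monotonically along this path,
    so bisection on the interpolation factor is valid.
    """
    def interp(t):
        return mu + t * (mu_t - mu), sd + t * (sd_t - sd)

    # If the target itself is within the KL bound, jump straight to it.
    if kl_gauss(*interp(1.0), mu, sd) <= eps:
        return interp(1.0)

    lo, hi = 0.0, 1.0  # lo always satisfies the bound (t = 0 gives KL = 0)
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if kl_gauss(*interp(mid), mu, sd) <= eps:
            lo = mid
        else:
            hi = mid
    return interp(lo)


def run_curriculum(mu, sd, mu_t, sd_t, eps=0.05, max_iters=200):
    """Return the sequence of intermediate context distributions."""
    path = [(mu, sd)]
    for _ in range(max_iters):
        if abs(mu - mu_t) < 1e-12 and abs(sd - sd_t) < 1e-12:
            break
        mu, sd = step_toward_target(mu, sd, mu_t, sd_t, eps)
        path.append((mu, sd))
    return path


if __name__ == "__main__":
    # Hypothetical setup: start at an "easy" context distribution N(0, 1)
    # and progress toward a sharper target N(5, 0.5^2).
    for i, (m, s) in enumerate(run_curriculum(0.0, 1.0, 5.0, 0.5)):
        print(f"iter {i:2d}: mean={m:.3f}, std={s:.3f}")
```

In the method described above, the agent also weighs the reward it achieves on the sampled tasks when deciding how fast to progress; the sketch omits that trade-off and shows only the bounded, step-by-step movement of the sampling distribution.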

Code Implementations: 1 repo