LG · Dec 24, 2022

Understanding the Complexity Gains of Single-Task RL with a Curriculum

arXiv:2212.12809v3 · 22 citations · h-index: 166
Originality: Incremental advance
AI Analysis

This work addresses the problem of inefficient RL in robotics and AI by providing a curriculum-based method, though it is incremental as it builds on existing multi-task RL frameworks.

The paper tackles the challenge of reinforcement learning (RL) with poorly shaped rewards by reformulating single-task RL as a multi-task problem defined by a curriculum. It shows that, under mild regularity conditions on the curriculum, solving the tasks in sequence is more computationally efficient than solving the original task from scratch, and that this requires no explicit exploration bonuses or other exploration strategies. It also translates these theoretical insights into a practical algorithm that accelerates curriculum learning on simulated robotic tasks.

Reinforcement learning (RL) problems can be challenging without well-shaped rewards. Prior work on provably efficient RL methods generally proposes to address this issue with dedicated exploration strategies. However, another way to tackle this challenge is to reformulate it as a multi-task RL problem, where the task space contains not only the challenging task of interest but also easier tasks that implicitly function as a curriculum. Such a reformulation opens up the possibility of running existing multi-task RL methods as a more efficient alternative to solving a single challenging task from scratch. In this work, we provide a theoretical framework that reformulates a single-task RL problem as a multi-task RL problem defined by a curriculum. Under mild regularity conditions on the curriculum, we show that sequentially solving each task in the multi-task RL problem is more computationally efficient than solving the original single-task problem, without any explicit exploration bonuses or other exploration strategies. We also show that our theoretical insights can be translated into an effective practical learning algorithm that can accelerate curriculum learning on simulated robotic tasks.
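To make the idea of sequentially solving a curriculum concrete, here is a minimal toy sketch. It is not the paper's algorithm or its setting: the chain environment, the sigmoid policy, the exact-gradient update, and every name below (`success_prob`, `train`, `N`, `target`) are assumptions chosen for illustration. The agent must pick "advance" at each of `N` states to earn a sparse reward, so a policy initialized at zero succeeds with probability 2^-N and receives a correspondingly tiny gradient. Warm-starting through tasks of length 1, 2, ..., N keeps the success probability high at every stage.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def success_prob(theta, k):
    """Probability of reaching state k: the first k 'advance' choices must all be taken."""
    return float(np.prod(sigmoid(theta[:k])))


def train(theta, k, target=0.9, lr=1.0, max_iters=200_000):
    """Exact gradient ascent on success_prob(theta, k).

    Returns the updated parameters and the number of updates performed.
    No exploration bonus is used; only the sparse success signal drives learning.
    """
    theta = theta.copy()
    for it in range(max_iters):
        p = sigmoid(theta[:k])
        J = float(np.prod(p))
        if J >= target:
            return theta, it
        # d/d theta_s of prod(p) equals prod(p) * (1 - p_s) for a sigmoid policy.
        theta[:k] += lr * J * (1.0 - p)
    return theta, max_iters


N = 16  # length of the hard task (hypothetical value)

# Baseline: solve the hard task directly from a zero-initialized policy.
scratch_theta, scratch_iters = train(np.zeros(N), N)

# Curriculum: solve tasks of length 1..N in order, warm-starting each from the last.
curriculum_theta = np.zeros(N)
curriculum_iters = 0
for k in range(1, N + 1):
    curriculum_theta, used = train(curriculum_theta, k)
    curriculum_iters += used

print("from scratch:", scratch_iters, "updates")
print("curriculum:  ", curriculum_iters, "updates in total")
```

In this toy setting the curriculum run needs far fewer total updates than the from-scratch run, because each new task starts from a policy that already succeeds on all but the last step. The gap widens quickly as `N` grows, since the from-scratch gradient shrinks exponentially in `N`.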

