Probabilistic Curriculum Learning for Goal-Based Reinforcement Learning
This work addresses the problem of automating goal creation in reinforcement learning for researchers and practitioners, though it appears incremental, as it builds on existing curriculum learning methods.
The paper tackles the challenge of automating goal creation in reinforcement learning by introducing a probabilistic curriculum learning algorithm, which suggests goals for agents in continuous control and navigation tasks.
Reinforcement learning (RL) -- algorithms that teach artificial agents to interact with environments by maximising reward signals -- has achieved significant success in recent years. These successes have been facilitated by advances in algorithms (e.g., deep Q-learning, deep deterministic policy gradients, proximal policy optimisation, trust region policy optimisation, and soft actor-critic) and by specialised computational resources such as GPUs and TPUs. One promising research direction involves introducing goals to allow multimodal policies, commonly through hierarchical or curriculum reinforcement learning. These methods systematically decompose complex behaviours into simpler sub-tasks, analogous to how humans progressively learn skills (e.g., we learn to walk before we run, and we learn arithmetic before calculus). However, fully automating goal creation remains an open challenge. We present a novel probabilistic curriculum learning algorithm that suggests goals for reinforcement learning agents in continuous control and navigation tasks.