LG AI RO May 17, 2017

Automatic Goal Generation for Reinforcement Learning Agents

Carlos Florensa, David Held, Xinyang Geng, Pieter Abbeel

arXiv:1705.06366v5 32.5572 citations

Originality: Incremental advance

AI Analysis

The paper addresses the scalability problem of reinforcement learning in multi-task settings, such as navigation or object manipulation, by letting the agent discover tasks automatically. It is an incremental improvement over existing curriculum learning methods.

The paper tackles the problem of training reinforcement learning agents to perform diverse tasks without predefined reward functions. It proposes an automatic goal generation method in which a generator network, trained adversarially, creates a curriculum of tasks at a difficulty level appropriate for the agent. The result is an agent that efficiently learns a wide set of tasks, including those with sparse rewards, without prior knowledge of its environment.

Reinforcement learning is a powerful technique to train an agent to perform a task. However, an agent that is trained using reinforcement learning is only capable of achieving the single task that is specified via its reward function. Such an approach does not scale well to settings in which an agent needs to perform a diverse set of tasks, such as navigating to varying positions in a room or moving objects to varying locations. Instead, we propose a method that allows an agent to automatically discover the range of tasks that it is capable of performing. We use a generator network to propose tasks for the agent to try to achieve, specified as goal states. The generator network is optimized using adversarial training to produce tasks that are always at the appropriate level of difficulty for the agent. Our method thus automatically produces a curriculum of tasks for the agent to learn. We show that, by using this framework, an agent can efficiently and automatically learn to perform a wide set of tasks without requiring any prior knowledge of its environment. Our method can also learn to achieve tasks with sparse rewards, which traditionally pose significant challenges.
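The abstract does not say how "the appropriate level of difficulty" is measured or how the generator is trained. As a rough sketch, assume difficulty is judged by the agent's success rate on each goal, and that a goal counts as appropriate when that rate lies between two thresholds (`r_min` and `r_max`, illustrative values). The adversarially trained generator network is replaced here by simple resampling around the accepted goals, and `success_rate` is a toy stand-in for policy rollouts. All names are hypothetical.

```python
import random


def success_rate(goal, reach):
    """Toy stand-in for policy rollouts (assumption, not from the paper):
    goals within `reach` always succeed; beyond it, success falls off linearly."""
    return max(0.0, min(1.0, 1.0 - (abs(goal) - reach)))


def label_goals(rates, r_min=0.1, r_max=0.9):
    """Label a goal 1 if its success rate is intermediate (neither too hard
    nor already mastered), else 0. Thresholds are illustrative assumptions."""
    return [1 if r_min <= r <= r_max else 0 for r in rates]


def curriculum(iterations=5, n_goals=50, seed=0):
    """Run a toy curriculum loop over 1-D goals and return a list of
    (reach, number_of_accepted_goals) pairs, one per iteration."""
    rng = random.Random(seed)
    reach = 0.5  # hypothetical scalar summary of the agent's current skill
    goals = [rng.uniform(-1.0, 1.0) for _ in range(n_goals)]
    history = []
    for _ in range(iterations):
        rates = [success_rate(g, reach) for g in goals]
        labels = label_goals(rates)
        accepted = [g for g, keep in zip(goals, labels) if keep == 1]
        history.append((reach, len(accepted)))
        if accepted:
            # "Training" on accepted goals: skill grows toward their mean distance.
            mean_dist = sum(abs(g) for g in accepted) / len(accepted)
            reach = max(reach, mean_dist)
            # Stand-in for the generator: propose new goals near accepted ones.
            goals = [rng.choice(accepted) + rng.gauss(0.0, 0.3)
                     for _ in range(n_goals)]
        else:
            # No usable goals: fall back to broad sampling around current skill.
            goals = [rng.uniform(-reach - 1.0, reach + 1.0)
                     for _ in range(n_goals)]
    return history


if __name__ == "__main__":
    for step, (reach, n_accepted) in enumerate(curriculum()):
        print(f"iteration {step}: reach={reach:.2f}, accepted goals={n_accepted}")
```

In this toy setup the accepted goals always lie just beyond the agent's current reach, so the proposed goals move outward as the agent improves. That is the curriculum effect the abstract describes, shown here without any learned generator or real policy.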
