Learning High-level Representations from Demonstrations
This paper addresses skill acquisition in hierarchical reinforcement learning for domains with long horizons and sparse rewards. It represents an incremental improvement over existing methods.
The paper tackles the problem of identifying reusable skills for hierarchical learning in complex sequential decision tasks. It proposes a method that infers subgoals from human demonstrations and uses them to decompose the problem into an abstract high-level representation and a set of low-level subtasks. The results show that the method significantly outperforms previous baselines on two challenging problems: Montezuma's Revenge and a simulated robotics maze task.
Hierarchical learning (HL) is key to solving complex sequential decision problems with long horizons and sparse rewards. It allows learning agents to break up large problems into smaller, more manageable subtasks. A common approach to HL is to provide the agent with a number of high-level skills that solve small parts of the overall problem. A major open question, however, is how to identify a suitable set of reusable skills. We propose a principled approach that uses human demonstrations to infer a set of subgoals based on changes in the demonstration dynamics. Using these subgoals, we decompose the learning problem into an abstract high-level representation and a set of low-level subtasks. The abstract description captures the overall problem structure, while the subtasks capture the desired skills. We demonstrate that we can jointly optimize over both levels of learning. We show that the resulting method significantly outperforms previous baselines on two challenging problems: the Atari 2600 game Montezuma's Revenge, and a simulated robotics problem in which an ant robot must navigate a maze.
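To make the idea of inferring subgoals from changes in the demonstration dynamics concrete, the following is a minimal sketch and not the method proposed in the paper. It assumes a demonstration is a sequence of low-dimensional state vectors and flags a state as a candidate subgoal when the direction of motion changes sharply. The function name `candidate_subgoals`, the cosine-similarity heuristic, and the `cos_threshold` parameter are all hypothetical choices made for illustration.

```python
import numpy as np


def candidate_subgoals(trajectory, cos_threshold=0.5):
    """Return indices of states where the direction of motion changes sharply.

    Illustrative heuristic only: a stand-in for "changes in the
    demonstration dynamics", not the paper's inference procedure.
    """
    states = np.asarray(trajectory, dtype=float)
    deltas = np.diff(states, axis=0)          # per-step displacement, shape (T-1, d)
    norms = np.linalg.norm(deltas, axis=1)    # step lengths

    indices = []
    for t in range(1, len(deltas)):
        if norms[t - 1] == 0.0 or norms[t] == 0.0:
            continue  # direction is undefined for stationary steps
        # Cosine similarity between the step into state t and the step out of it.
        cos = np.dot(deltas[t - 1], deltas[t]) / (norms[t - 1] * norms[t])
        if cos < cos_threshold:
            indices.append(t)  # state t is where the motion changes direction
    return indices


# A toy 2-D demonstration: move right, then up, then left.
demo = [(0, 0), (1, 0), (2, 0), (3, 0), (3, 1), (3, 2), (2, 2), (1, 2)]
print(candidate_subgoals(demo))  # [3, 5], i.e. the corner states (3, 0) and (3, 2)
```

In this toy demonstration the two corners are recovered as candidate subgoals, splitting the trajectory into three straight segments that could each serve as a low-level subtask. Real demonstrations in pixel-based domains such as Montezuma's Revenge would need a learned or hand-designed state representation before any such heuristic could apply.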