Joint Learning of Hierarchical Neural Options and Abstract World Model
This paper addresses the challenge of sample efficiency in hierarchical reinforcement learning for AI agents, though the contribution appears incremental because it builds on existing model-free methods.
The paper tackles the problem of efficiently acquiring a sequence of skills for AI agents. It proposes AgentOWL, which jointly learns an abstract world model and hierarchical neural options, and reports that the method learns more skills with much less data than baseline methods on a subset of Object-Centric Atari games.
Building agents that can perform new skills by composing existing skills is a long-standing goal of AI agent research. Towards this end, we investigate how to efficiently acquire a sequence of skills, formalized as hierarchical neural options. However, existing model-free hierarchical reinforcement learning algorithms require large amounts of data. We propose a novel method, which we call AgentOWL (Option and World model Learning Agent), that jointly learns, in a sample-efficient way, an abstract world model (abstracting across both states and time) and a set of hierarchical neural options. We show, on a subset of Object-Centric Atari games, that our method can learn more skills using much less data than baseline methods.
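To make the notion of hierarchical options concrete, the following is a minimal sketch of the general options idea: an option specifies where it may start, how it acts, and when it terminates, and a higher-level option may invoke lower-level options as if they were actions. It is not the AgentOWL algorithm. The 1-D toy environment, the hand-written (non-neural, non-learned) policies, and all names (`State`, `Option`, `run_option`, `fetch_item`, `go_home`, `deliver`) are assumptions made purely for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Union

ITEM_POS = 3  # hypothetical location of an item on a 1-D line


@dataclass(frozen=True)
class State:
    pos: int        # agent position on the line
    has_item: bool  # whether the item has been picked up


def step(state: State, action: int) -> State:
    """Toy dynamics: move by -1 or +1; the item is picked up on arrival."""
    pos = state.pos + action
    return State(pos, state.has_item or pos == ITEM_POS)


@dataclass
class Option:
    """An option: initiation condition, policy, and termination condition."""
    name: str
    can_start: Callable[[State], bool]
    # The policy returns either a primitive action (int) or a sub-option.
    policy: Callable[[State], Union[int, "Option"]]
    is_done: Callable[[State], bool]


def run_option(option: Option, state: State, trace: List[int],
               max_steps: int = 100) -> State:
    """Run an option to termination, recursing into any sub-option it picks."""
    if not option.can_start(state):
        raise ValueError(f"option {option.name} cannot start in {state}")
    steps = 0
    while not option.is_done(state) and steps < max_steps:
        choice = option.policy(state)
        if isinstance(choice, Option):
            state = run_option(choice, state, trace, max_steps)
        else:
            trace.append(choice)  # record primitive actions only
            state = step(state, choice)
        steps += 1
    return state


# Two low-level options with hand-written policies (assumed, not learned).
fetch_item = Option(
    name="fetch_item",
    can_start=lambda s: not s.has_item,
    policy=lambda s: 1 if s.pos < ITEM_POS else -1,
    is_done=lambda s: s.has_item,
)
go_home = Option(
    name="go_home",
    can_start=lambda s: True,
    policy=lambda s: -1 if s.pos > 0 else 1,
    is_done=lambda s: s.pos == 0,
)

# A higher-level option composed from the two existing options.
deliver = Option(
    name="deliver",
    can_start=lambda s: True,
    policy=lambda s: go_home if s.has_item else fetch_item,
    is_done=lambda s: s.has_item and s.pos == 0,
)

trace: List[int] = []
final_state = run_option(deliver, State(pos=1, has_item=False), trace)
print(final_state, trace)
```

In this sketch `deliver` never issues primitive actions itself; it only chooses between existing options, which is the compositional structure the abstract refers to. In a learned setting the lambdas would be replaced by trained networks, and a world model abstracting over time would predict the state in which an option terminates rather than each primitive step.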