LG AI ML

May 15, 2019

Meta reinforcement learning as task inference

Jan Humplik, Alexandre Galashov, Leonard Hasenclever, Pedro A. Ortega, Yee Whye Teh, Nicolas Heess

arXiv:1905.06424v2 26.7 141 citations

Originality: Incremental advance

AI Analysis

The paper addresses sample-efficient learning in RL. The advance is incremental because it builds on existing meta-RL frameworks.

The paper frames efficient reinforcement learning as the problem of inferring hidden task information through meta-learning. It proposes a method that learns the policy and the task belief separately, using privileged information, and reports that the method is effective in standard meta-RL environments and in a complex continuous control setting.

Humans achieve efficient learning by relying on prior knowledge about the structure of naturally occurring tasks. There is considerable interest in designing reinforcement learning (RL) algorithms with similar properties. This includes proposals to learn the learning algorithm itself, an idea also known as meta learning. One formal interpretation of this idea is as a partially observable multi-task RL problem in which task information is hidden from the agent. Such unknown task problems can be reduced to Markov decision processes (MDPs) by augmenting an agent's observations with an estimate of the belief about the task based on past experience. However, estimating the belief state is intractable in most partially observed MDPs. We propose a method that separately learns the policy and the task belief by taking advantage of various kinds of privileged information. Our approach can be very effective at solving standard meta-RL environments, as well as a complex continuous control environment that has sparse rewards and requires long-term memory.
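
To make the idea of augmenting observations with a task belief concrete, here is a minimal sketch for a toy case in which the belief is tractable: a bandit whose hidden task is one of a few known candidates. This is not the authors' method, which learns the belief from privileged information precisely because exact inference is usually intractable. The candidate table `task_arm_probs`, the interaction `history`, and all function names are hypothetical and chosen only for illustration.

```python
import numpy as np

def bernoulli_likelihoods(task_arm_probs, arm, reward):
    """Likelihood of a 0/1 reward from `arm` under each candidate task."""
    p = task_arm_probs[:, arm]
    return p if reward == 1 else 1.0 - p

def update_belief(belief, likelihoods):
    """One Bayes step: posterior is proportional to prior times likelihood."""
    posterior = belief * likelihoods
    return posterior / posterior.sum()

def augment(observation, belief):
    """Concatenate the raw observation with the current task belief."""
    return np.concatenate([observation, belief])

# Hypothetical setup: 3 candidate tasks, each a 2-armed Bernoulli bandit.
# Row i holds the reward probability of each arm under task i.
task_arm_probs = np.array([
    [0.9, 0.1],
    [0.1, 0.9],
    [0.5, 0.5],
])

belief = np.full(3, 1.0 / 3.0)        # uniform prior over tasks
history = [(0, 1), (0, 1), (1, 0)]    # assumed (arm, reward) pairs

for arm, reward in history:
    likelihoods = bernoulli_likelihoods(task_arm_probs, arm, reward)
    belief = update_belief(belief, likelihoods)

# The policy would see the last (arm, reward) pair plus the belief vector.
last_arm, last_reward = history[-1]
augmented = augment(np.array([float(last_arm), float(last_reward)]), belief)
print(belief, augmented.shape)
```

With the belief appended, the policy input summarizes the interaction history, which is what lets the unknown-task problem be treated as an MDP over augmented observations. In larger problems the exact posterior above is unavailable, which is the gap a learned belief is meant to fill.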
