LG | AI | NE | ML | Oct 9, 2019

Improving Generalization in Meta Reinforcement Learning using Learned Objectives

arXiv:1910.04098v2 | 132 citations
AI Analysis

This paper addresses poor generalization in meta-RL, offering AI researchers a potentially more adaptable and sample-efficient approach.

The paper tackles the problem of generalization in meta reinforcement learning by introducing MetaGenRL, a novel algorithm that meta-learns a neural objective function from the experiences of multiple agents, enabling it to generalize to entirely new environments and sometimes outperform human-engineered RL algorithms.

Biological evolution has distilled the experiences of many learners into the general learning algorithms of humans. Our novel meta reinforcement learning algorithm MetaGenRL is inspired by this process. MetaGenRL distills the experiences of many complex agents to meta-learn a low-complexity neural objective function that decides how future individuals will learn. Unlike recent meta-RL algorithms, MetaGenRL can generalize to new environments that are entirely different from those used for meta-training. In some cases, it even outperforms human-engineered RL algorithms. MetaGenRL uses off-policy second-order gradients during meta-training that greatly increase its sample efficiency.
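To make the phrase "second-order gradients" concrete, here is a minimal toy sketch, not the paper's implementation. It assumes the objective parameters are improved by differentiating a critic's value of the post-update policy back through the policy update. Everything in it is a hypothetical stand-in: a scalar policy parameter `phi`, a scalar objective parameter `alpha` in place of a neural objective function, a fixed quadratic `critic` in place of a learned one, and the target constant `A_STAR`.

```python
# Toy sketch only: every name and functional form below is an illustrative
# assumption, not taken from the MetaGenRL code or paper.

A_STAR = 2.0  # hypothetical: the critic prefers actions equal to A_STAR * s


def policy(phi, s):
    """Deterministic linear policy: action = phi * s."""
    return phi * s


def critic(s, a):
    """Stand-in critic Q(s, a), maximal when a == A_STAR * s."""
    return -(a - A_STAR * s) ** 2


def inner_update(phi, alpha, s, lr):
    """One policy gradient step on the 'learned' objective
    L_alpha(phi) = 0.5 * (phi * s - alpha * s) ** 2."""
    grad_phi = (phi - alpha) * s * s  # dL_alpha / dphi
    return phi - lr * grad_phi


def meta_gradient(phi, alpha, s, lr):
    """d Q(s, policy(phi', s)) / d alpha, where phi' = inner_update(...).
    The chain rule passes through the inner step, so the mixed second
    derivative d^2 L / (d alpha d phi) = -s^2 appears."""
    phi_new = inner_update(phi, alpha, s, lr)
    a = policy(phi_new, s)
    dq_da = -2.0 * (a - A_STAR * s)   # dQ / da
    da_dphi = s                        # d policy / d phi
    dphi_dalpha = lr * s * s           # -lr * d^2 L / (d alpha d phi)
    return dq_da * da_dphi * dphi_dalpha


def meta_train(steps=200, lr=0.5, meta_lr=0.5, s=1.0):
    """Alternate a meta-step on alpha (ascent on Q) with a policy step."""
    phi, alpha = 0.0, 0.0
    for _ in range(steps):
        alpha += meta_lr * meta_gradient(phi, alpha, s, lr)
        phi = inner_update(phi, alpha, s, lr)
    return phi, alpha


if __name__ == "__main__":
    print(meta_train())
```

In this toy setting the objective parameter is pulled toward whatever value makes the updated policy score well under the critic. The real method replaces each scalar with a neural network and trains on replayed experience from several agents.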
