LG | Oct 27, 2022

Meta-Reinforcement Learning Using Model Parameters

arXiv:2210.15515v1 | h-index: 25
Originality: Incremental advance
AI Analysis

This work addresses the challenge of rapid adaptation in reinforcement learning for agents operating in diverse environments, representing an incremental improvement over existing meta-RL methods.

The paper tackles the problem of meta-reinforcement learning by proposing RAMP, which uses model parameters from a learned dynamic model as context to enable efficient adaptation to new environments, achieving improved sample efficiency and adaptation speed in experiments.

In meta-reinforcement learning, an agent is trained in multiple different environments and attempts to learn a meta-policy that can efficiently adapt to a new environment. This paper presents RAMP, a Reinforcement learning Agent using Model Parameters that utilizes the idea that a neural network trained to predict environment dynamics encapsulates the environment information. RAMP is constructed in two phases: in the first phase, a multi-environment parameterized dynamic model is learned. In the second phase, the model parameters of the dynamic model are used as context for the multi-environment policy of the model-free reinforcement learning agent.
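The two-phase idea can be illustrated with a toy sketch. This is not RAMP's implementation: it assumes a scalar linear system `s_next = a*s + b*u` and uses a closed-form least-squares fit in place of the neural-network dynamic model, so the "model parameters" are just the pair `(a, b)`. All names here (`fit_dynamics_params`, `collect_transitions`, `context_policy`) and the hand-picked policy weights are hypothetical; in the paper the policy would be trained by a model-free RL algorithm rather than fixed by hand.

```python
import random


def collect_transitions(a, b, n=50, seed=0):
    """Sample (state, action, next_state) triples from a toy environment
    with assumed scalar linear dynamics s_next = a*s + b*u."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        s, u = rng.gauss(0, 1), rng.gauss(0, 1)
        data.append((s, u, a * s + b * u))
    return data


def fit_dynamics_params(transitions):
    """Phase 1 (stand-in): least-squares fit of s_next ~ a*s + b*u.
    The fitted parameters (a, b) serve as the environment context."""
    sss = sum(s * s for s, u, y in transitions)
    ssu = sum(s * u for s, u, y in transitions)
    suu = sum(u * u for s, u, y in transitions)
    ssy = sum(s * y for s, u, y in transitions)
    suy = sum(u * y for s, u, y in transitions)
    det = sss * suu - ssu * ssu  # determinant of the 2x2 normal equations
    a = (ssy * suu - suy * ssu) / det
    b = (suy * sss - ssy * ssu) / det
    return (a, b)


def context_policy(state, context, weights):
    """Phase 2 (interface only): a linear policy whose input is the state
    concatenated with the dynamics-model parameters."""
    features = (state, *context)
    return sum(w * f for w, f in zip(weights, features))


# Two hypothetical environments that differ only in their dynamics.
ctx_slow = fit_dynamics_params(collect_transitions(a=0.9, b=0.1))
ctx_fast = fit_dynamics_params(collect_transitions(a=0.5, b=1.0))

# One shared policy; illustrative weights, not learned.
weights = (-1.0, 0.5, -0.5)
action_slow = context_policy(1.0, ctx_slow, weights)
action_fast = context_policy(1.0, ctx_fast, weights)
```

The same policy weights yield different actions for the same state because the context differs between the two environments, which is the mechanism the abstract describes for adapting a single multi-environment policy.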

