LGAINERO Feb 20, 2023

Meta-World Conditional Neural Processes

arXiv:2302.10320v1, 2 citations, h-index: 25
Originality: Incremental advance
AI Analysis

This paper addresses sample efficiency in meta-reinforcement learning, where an agent must adapt to new tasks. The contribution appears incremental, since it builds on existing Conditional Neural Processes and meta-learning frameworks.

The paper aims to reduce an agent's interaction with unseen target environments in meta-reinforcement learning. It proposes Meta-World Conditional Neural Processes (MW-CNP), which lets the agent adapt using significantly fewer samples from the target environment than the baselines.

We propose Meta-World Conditional Neural Processes (MW-CNP), a conditional world model generator that leverages the sample efficiency and scalability of Conditional Neural Processes to enable an agent to sample from its own "hallucination". We intend to reduce the agent's interaction with the target environment at test time as much as possible. To reduce the number of samples required at test time, we first obtain a latent representation of the transition dynamics from a single rollout from the test environment with hidden parameters. Then, we obtain rollouts for few-shot learning by interacting with the "hallucination" generated by the meta-world model. Using the world model representation from MW-CNP, the meta-RL agent can adapt to an unseen target environment with significantly fewer samples collected from the target environment compared to the baselines. We emphasize that the agent does not have access to the task parameters throughout training and testing, and that MW-CNP is trained on offline interaction data logged during meta-training.
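The two-stage procedure in the abstract (encode a single real rollout into a latent, then roll out inside the generated model) can be illustrated with a minimal sketch. This is not the paper's architecture or code. The layer sizes, the random untrained weights, the choice to predict a state delta, the toy dynamics that produce the context rollout, and all function names are assumptions made for illustration. A full Conditional Neural Process would typically also output a predictive variance and would be trained on logged transitions.

```python
import math
import random

def make_linear(n_in, n_out, rng):
    """Random, untrained layer: n_out rows of n_in weights plus one bias (assumption)."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_out)]

def linear(layer, x, activation=None):
    """Apply the layer to vector x, optionally followed by an elementwise activation."""
    out = [sum(w * v for w, v in zip(row[:-1], x)) + row[-1] for row in layer]
    return [activation(o) for o in out] if activation else out

def encode_context(encoder, context):
    """Encode each (s, a, s') transition, then average the codes.
    The mean makes the latent independent of the order of the transitions."""
    codes = [linear(encoder, s + a + s_next, math.tanh) for s, a, s_next in context]
    return [sum(col) / len(codes) for col in zip(*codes)]

def predict_next_state(decoder, latent, state, action):
    """Decoder conditioned on the latent; predicts a state delta (illustrative choice)."""
    delta = linear(decoder, latent + state + action)
    return [s + d for s, d in zip(state, delta)]

def hallucinate_rollout(decoder, latent, policy, start_state, horizon):
    """Generate an imagined rollout without touching the real environment."""
    rollout, state = [], start_state
    for _ in range(horizon):
        action = policy(state)
        next_state = predict_next_state(decoder, latent, state, action)
        rollout.append((state, action, next_state))
        state = next_state
    return rollout

rng = random.Random(0)
STATE_DIM, ACTION_DIM, LATENT_DIM = 3, 1, 4
encoder = make_linear(2 * STATE_DIM + ACTION_DIM, LATENT_DIM, rng)
decoder = make_linear(LATENT_DIM + STATE_DIM + ACTION_DIM, STATE_DIM, rng)

# Stand-in for the single rollout collected from the target environment.
# The toy dynamics here are hypothetical; the task parameters stay hidden from the agent.
context, state = [], [0.0] * STATE_DIM
for _ in range(5):
    action = [rng.uniform(-1.0, 1.0)]
    next_state = [s + 0.1 * action[0] for s in state]
    context.append((state, action, next_state))
    state = next_state

latent = encode_context(encoder, context)      # stage 1: one real rollout -> latent
policy = lambda s: [0.5]                       # placeholder policy
imagined = hallucinate_rollout(decoder, latent, policy, context[0][0], horizon=10)
```

In this sketch the only interaction with the target environment is the five-step context rollout; every transition in `imagined` comes from the decoder, which is where the sample savings described in the abstract would come from once the model is trained.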

