LG | AI | Feb 27, 2024

Unsupervised Zero-Shot Reinforcement Learning via Functional Reward Encodings

arXiv:2402.17135v1 | 29 citations | h-index: 16 | Has Code | ICML
Originality: Highly original
AI Analysis

This paper addresses zero-shot adaptation in reinforcement learning for robotics, offering a scalable approach with reported improvements on simulated benchmarks.

The paper tackles the problem of pre-training a generalist agent from unlabeled offline trajectories so that it can adapt zero-shot to new tasks. It does so by learning functional reward encodings, and the resulting agents often outperform previous zero-shot and offline RL methods on simulated robotic benchmarks.

Can we pre-train a generalist agent from a large amount of unlabeled offline trajectories such that it can be immediately adapted to any new downstream tasks in a zero-shot manner? In this work, we present a functional reward encoding (FRE) as a general, scalable solution to this zero-shot RL problem. Our main idea is to learn functional representations of any arbitrary tasks by encoding their state-reward samples using a transformer-based variational auto-encoder. This functional encoding not only enables the pre-training of an agent from a wide diversity of general unsupervised reward functions, but also provides a way to solve any new downstream tasks in a zero-shot manner, given a small number of reward-annotated samples. We empirically show that FRE agents trained on diverse random unsupervised reward functions can generalize to solve novel tasks in a range of simulated robotic benchmarks, often outperforming previous zero-shot RL and offline RL methods. Code for this project is provided at: https://github.com/kvfrans/fre
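The encoding step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation (see the linked repository for that). A mean-pooled, permutation-invariant encoder stands in for the transformer, the weights are random rather than trained, and the reward families (`make_goal_reward`, `make_linear_reward`) and all other names are hypothetical choices for illustration. The sketch only shows the data flow: a small set of state-reward samples is mapped to a Gaussian latent, from which a task vector `z` is drawn.

```python
import math
import random


def make_goal_reward(goal, radius=0.5):
    """Hypothetical reward family: 0 near a goal state, -1 elsewhere."""
    def reward(state):
        return 0.0 if math.dist(state, goal) < radius else -1.0
    return reward


def make_linear_reward(weights):
    """Hypothetical reward family: a linear function of the state."""
    def reward(state):
        return sum(w * s for w, s in zip(weights, state))
    return reward


def init_encoder(state_dim, latent_dim, rng):
    """Random (untrained) weights: one row per output unit.

    There are 2 * latent_dim outputs: latent means, then log-variances.
    """
    return [[rng.gauss(0.0, 0.5) for _ in range(state_dim + 1)]
            for _ in range(2 * latent_dim)]


def encode(pairs, weights):
    """Stand-in for the transformer encoder (an assumption of this sketch).

    Each (state, reward) pair is one token; tokens are embedded with a
    linear map plus tanh and mean-pooled, so the order of samples is
    irrelevant. Returns (mean, log_var) of a diagonal Gaussian.
    """
    latent_dim = len(weights) // 2
    pooled = [0.0] * len(weights)
    for state, r in pairs:
        token = list(state) + [r]
        for j, row in enumerate(weights):
            pooled[j] += math.tanh(sum(w * x for w, x in zip(row, token)))
    pooled = [p / len(pairs) for p in pooled]
    return pooled[:latent_dim], pooled[latent_dim:]


def sample_latent(mean, log_var, rng):
    """Reparameterized sample: z = mean + std * eps, with eps ~ N(0, 1)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mean, log_var)]


rng = random.Random(0)

# Stand-in for states drawn from an unlabeled offline dataset.
states = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(256)]

# A downstream task specified only through a few reward-annotated samples.
reward_fn = make_goal_reward(goal=(0.3, -0.2))
pairs = [(s, reward_fn(s)) for s in rng.sample(states, 32)]

enc_weights = init_encoder(state_dim=2, latent_dim=4, rng=rng)
mean, log_var = encode(pairs, enc_weights)
z = sample_latent(mean, log_var, rng)  # task vector for a z-conditioned agent
```

In a full system the encoder would be trained as part of a variational auto-encoder over many sampled reward functions, and the agent would take `z` as an additional input. At test time only the `encode` step is needed for a new task, which is what makes the adaptation zero-shot.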

Code Implementations: 1 repo
