AIRO | Oct 11, 2023

RoboCLIP: One Demonstration is Enough to Learn Robot Policies

arXiv:2310.07899v1 | 144 citations | h-index: 66
Originality: Highly original
AI Analysis

This paper addresses the challenge of reducing the expert supervision and data needed for robot policy learning, offering a more efficient approach for robotics applications.

The paper tackles two problems in reinforcement learning: reward specification and large data requirements. It introduces RoboCLIP, an online imitation learning method that uses a single video or text demonstration to generate rewards without manual reward design. Agents trained this way achieve 2-3 times higher zero-shot performance than competing imitation learning methods on robot manipulation tasks.

Reward specification is a notoriously difficult problem in reinforcement learning, requiring extensive expert supervision to design robust reward functions. Imitation learning (IL) methods attempt to circumvent these problems by utilizing expert demonstrations but typically require a large number of in-domain expert demonstrations. Inspired by advances in the field of Video-and-Language Models (VLMs), we present RoboCLIP, an online imitation learning method that uses a single demonstration (overcoming the large data requirement) in the form of a video demonstration or a textual description of the task to generate rewards without manual reward function design. Additionally, RoboCLIP can also utilize out-of-domain demonstrations, like videos of humans solving the task for reward generation, circumventing the need to have the same demonstration and deployment domains. RoboCLIP utilizes pretrained VLMs without any finetuning for reward generation. Reinforcement learning agents trained with RoboCLIP rewards demonstrate 2-3 times higher zero-shot performance than competing imitation learning methods on downstream robot manipulation tasks, doing so using only one video/text demonstration.
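The reward-generation idea in the abstract can be illustrated with a minimal sketch. It assumes that a frozen pretrained VLM has already mapped both the agent's episode video and the single demonstration (video or text) into a shared embedding space, and that the reward is a similarity score between the two embeddings. The function names, the choice of cosine similarity, and the choice to give the score only at the final step of an episode are assumptions made for illustration; the abstract does not specify them.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine similarity between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # similarity is undefined for a zero vector; treat as no reward
    return dot / (norm_a * norm_b)


def episode_reward(episode_embedding: Vector, demo_embedding: Vector) -> float:
    """Hypothetical reward: similarity between the agent's episode and the demo.

    Both embeddings are assumed to come from the same frozen, pretrained VLM
    (no finetuning), which is not implemented here.
    """
    return cosine_similarity(episode_embedding, demo_embedding)


def sparse_rewards(num_steps: int, episode_embedding: Vector,
                   demo_embedding: Vector) -> List[float]:
    """Assumed reward schedule: zero at every step except the last."""
    if num_steps < 1:
        raise ValueError("an episode needs at least one step")
    rewards = [0.0] * num_steps
    rewards[-1] = episode_reward(episode_embedding, demo_embedding)
    return rewards
```

In this sketch, an episode whose embedding points in the same direction as the demonstration embedding receives the maximum reward of 1.0, and an orthogonal one receives 0.0. Any standard reinforcement learning algorithm could then be trained on these rewards in place of a hand-designed reward function.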

Code Implementations: 1 repo
