RO | AI | LG | Feb 14, 2025

Video2Policy: Scaling up Manipulation Tasks in Simulation through Internet Videos

arXiv:2502.09886v1 | 24 citations | h-index: 13
Originality: Highly original
AI Analysis

This work addresses the challenge of generating diverse and realistic tasks for training generalist robot policies, a problem that matters to robotics researchers and engineers working to improve robot capabilities in real-world environments.

The authors scale up manipulation tasks in simulation by reconstructing them from internet RGB videos, then train RL policies on the reconstructed tasks. They report reconstructing over 100 Something-Something-v2 videos covering 9 different tasks. The resulting simulation data can be scaled up to train a general policy, which is then transferred back to a real robot.

Simulation offers a promising approach for cheaply scaling training data for generalist policies. To scalably generate data from diverse and realistic tasks, existing algorithms either rely on large language models (LLMs) that may hallucinate tasks not interesting for robotics; or digital twins, which require careful real-to-sim alignment and are hard to scale. To address these challenges, we introduce Video2Policy, a novel framework that leverages internet RGB videos to reconstruct tasks based on everyday human behavior. Our approach comprises two phases: (1) task generation in simulation from videos; and (2) reinforcement learning utilizing in-context LLM-generated reward functions iteratively. We demonstrate the efficacy of Video2Policy by reconstructing over 100 videos from the Something-Something-v2 (SSv2) dataset, which depicts diverse and complex human behaviors on 9 different tasks. Our method can successfully train RL policies on such tasks, including complex and challenging tasks such as throwing. Finally, we show that the generated simulation data can be scaled up for training a general policy, and it can be transferred back to the real robot in a Real2Sim2Real way.
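
The second phase described above, reinforcement learning with iteratively LLM-generated reward functions, can be pictured as a propose, train, and feed-back loop. The following is a minimal sketch under stated assumptions, not the authors' implementation: the names `refine_reward`, `toy_propose`, and `toy_train_and_eval` are hypothetical, the LLM call and the RL trainer are replaced by deterministic toy stubs, and the reward is a function of a single scalar distance rather than a full simulator state.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# A toy reward function: maps a scalar "distance to goal" to a reward.
RewardFn = Callable[[float], float]


@dataclass
class Candidate:
    reward_fn: RewardFn
    success_rate: float


def refine_reward(
    propose: Callable[[str, int], RewardFn],
    train_and_eval: Callable[[RewardFn], float],
    iterations: int = 2,
    samples_per_iter: int = 2,
) -> Tuple[Candidate, List[float]]:
    """Keep the best reward candidate and feed its score into the next round."""
    best = None
    history: List[float] = []
    feedback = ""  # text context that a real system would pass to the LLM
    attempt = 0
    for _ in range(iterations):
        for _ in range(samples_per_iter):
            reward_fn = propose(feedback, attempt)
            success = train_and_eval(reward_fn)
            history.append(success)
            if best is None or success > best.success_rate:
                best = Candidate(reward_fn, success)
            attempt += 1
        # Summarize the outcome so the next proposals can build on it.
        feedback = f"best success rate so far: {best.success_rate:.2f}"
    return best, history


def toy_propose(feedback: str, attempt: int) -> RewardFn:
    # Hypothetical stand-in for an LLM call: later attempts weight distance more.
    weight = 0.25 * (attempt + 1)
    return lambda distance: -weight * distance


def toy_train_and_eval(reward_fn: RewardFn) -> float:
    # Hypothetical stand-in for RL training: steeper rewards score higher.
    return min(1.0, -reward_fn(1.0))


best, history = refine_reward(toy_propose, toy_train_and_eval)
print(history, best.success_rate)
```

In a real pipeline, `propose` would prompt an LLM with the task description and the feedback string, and `train_and_eval` would run an RL algorithm in the reconstructed simulation scene and report a measured success rate.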

