RO | CV | Oct 31, 2024

EgoMimic: Scaling Imitation Learning via Egocentric Video

arXiv:2410.24221v1 | 160 citations | h-index: 6 | ICRA
Originality: Incremental advance
AI Analysis

This work addresses data scarcity in imitation learning for robotics and enables better generalization to new scenes, though it appears incremental, combining existing techniques with a new data source.

The paper tackles the challenge of scaling imitation learning for manipulation by introducing EgoMimic, a framework that uses egocentric human videos paired with 3D hand tracking as demonstration data. It reports significant improvements over state-of-the-art imitation learning methods on diverse long-horizon tasks and finds that one additional hour of hand data is significantly more valuable than one additional hour of robot data.

The scale and diversity of demonstration data required for imitation learning is a significant challenge. We present EgoMimic, a full-stack framework which scales manipulation via human embodiment data, specifically egocentric human videos paired with 3D hand tracking. EgoMimic achieves this through: (1) a system to capture human embodiment data using the ergonomic Project Aria glasses, (2) a low-cost bimanual manipulator that minimizes the kinematic gap to human data, (3) cross-domain data alignment techniques, and (4) an imitation learning architecture that co-trains on human and robot data. Compared to prior works that only extract high-level intent from human videos, our approach treats human and robot data equally as embodied demonstration data and learns a unified policy from both data sources. EgoMimic achieves significant improvement on a diverse set of long-horizon, single-arm and bimanual manipulation tasks over state-of-the-art imitation learning methods and enables generalization to entirely new scenes. Finally, we show a favorable scaling trend for EgoMimic, where adding 1 hour of additional hand data is significantly more valuable than 1 hour of additional robot data. Videos and additional information can be found at https://egomimic.github.io/
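
The co-training idea in point (4) can be illustrated with a minimal sketch of how one training batch might mix the two data sources. This is not the authors' implementation: the function name `sample_cotraining_batch`, the `human_fraction` parameter, and the toy string "trajectories" are all assumptions made for illustration, and the abstract does not state what mixing ratio or sampling scheme EgoMimic uses.

```python
import random


def sample_cotraining_batch(human_data, robot_data, batch_size, human_fraction, rng):
    """Draw one batch that mixes both sources; each item is tagged with its origin.

    Hypothetical helper: the fixed-ratio mixing here is an assumption,
    not a detail taken from the paper.
    """
    n_human = round(batch_size * human_fraction)
    n_robot = batch_size - n_human
    batch = [("human", rng.choice(human_data)) for _ in range(n_human)]
    batch += [("robot", rng.choice(robot_data)) for _ in range(n_robot)]
    rng.shuffle(batch)  # interleave sources so the policy sees them together
    return batch


# Toy stand-ins for demonstration data (real data would be video plus hand or arm poses).
human_data = [f"hand_traj_{i}" for i in range(100)]
robot_data = [f"robot_traj_{i}" for i in range(20)]

rng = random.Random(0)
batch = sample_cotraining_batch(
    human_data, robot_data, batch_size=8, human_fraction=0.5, rng=rng
)
n_human_in_batch = sum(1 for source, _ in batch if source == "human")
```

Tagging each sample with its source leaves room for source-specific handling, such as separate normalization statistics, before a shared policy consumes the batch.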

Code Implementations: 1 repo