RO | AI | LG | May 16, 2024

TRANSIC: Sim-to-Real Policy Transfer by Learning from Online Correction

Stanford
arXiv:2405.10315v3 | 70 citations | h-index: 10 | CoRL
Originality: Incremental advance
AI Analysis

This work addresses the sim-to-real gap in robotics, a step toward more generalist robots, but the advance is incremental because it builds on existing human-in-the-loop methods.

The paper transfers policies learned in simulation to real robots by closing sim-to-real gaps with human-in-the-loop corrections, and it reports successful transfer on complex manipulation tasks such as furniture assembly.

Learning in simulation and transferring the learned policy to the real world has the potential to enable generalist robots. The key challenge of this approach is to address simulation-to-reality (sim-to-real) gaps. Previous methods often require domain-specific knowledge a priori. We argue that a straightforward way to obtain such knowledge is by asking humans to observe and assist robot policy execution in the real world. The robots can then learn from humans to close various sim-to-real gaps. We propose TRANSIC, a data-driven approach to enable successful sim-to-real transfer based on a human-in-the-loop framework. TRANSIC allows humans to augment simulation policies to overcome various unmodeled sim-to-real gaps holistically through intervention and online correction. Residual policies can be learned from human corrections and integrated with simulation policies for autonomous execution. We show that our approach can achieve successful sim-to-real transfer in complex and contact-rich manipulation tasks such as furniture assembly. Through synergistic integration of policies learned in simulation and from humans, TRANSIC is effective as a holistic approach to addressing various, often coexisting sim-to-real gaps. It displays attractive properties such as scaling with human effort. Videos and code are available at https://transic-robot.github.io/
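
The abstract's idea of learning residual policies from human corrections and integrating them with a simulation policy can be illustrated with a minimal sketch. Everything named here is hypothetical and not taken from the TRANSIC code: `CorrectionBuffer`, `fit_mean_residual`, and `combined_action` are illustrative helpers, and averaging the recorded corrections is a toy stand-in for whatever learned residual model the authors actually use. The sketch only assumes that a correction can be expressed as the difference between the human's action and the simulation policy's action, and that the two policies are combined additively.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

Vector = List[float]
BasePolicy = Callable[[Vector], Vector]
ResidualPolicy = Callable[[Vector, Vector], Vector]


@dataclass
class CorrectionBuffer:
    """Hypothetical store of (observation, base action, residual) triples."""
    samples: List[Tuple[Vector, Vector, Vector]] = field(default_factory=list)

    def record(self, obs: Vector, base_action: Vector, human_action: Vector) -> None:
        # Assumption: the residual is what the human changed relative to
        # the action proposed by the simulation-trained policy.
        residual = [h - b for h, b in zip(human_action, base_action)]
        self.samples.append((list(obs), list(base_action), residual))


def fit_mean_residual(buffer: CorrectionBuffer, action_dim: int) -> ResidualPolicy:
    """Toy stand-in for residual learning: return the average correction."""
    n = len(buffer.samples)
    mean = [0.0] * action_dim
    for _, _, residual in buffer.samples:
        mean = [m + r / n for m, r in zip(mean, residual)]
    # The returned policy ignores its inputs; a learned model would not.
    return lambda obs, base_action: list(mean)


def combined_action(obs: Vector, base_policy: BasePolicy,
                    residual_policy: ResidualPolicy) -> Vector:
    """Autonomous execution: simulation policy action plus learned residual."""
    base_action = base_policy(obs)
    residual = residual_policy(obs, base_action)
    return [b + r for b, r in zip(base_action, residual)]


# Toy usage: a linear "simulation policy" and a single human correction.
sim_policy = lambda obs: [0.5 * o for o in obs]
buffer = CorrectionBuffer()
obs = [1.0, 2.0]
buffer.record(obs, sim_policy(obs), human_action=[0.75, 1.0])
residual_policy = fit_mean_residual(buffer, action_dim=2)
print(combined_action(obs, sim_policy, residual_policy))
```

Keeping the simulation policy fixed and learning only the residual means the human data has to cover just the mismatch between simulation and reality, which is one plausible reading of why the approach is described as scaling with human effort.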
