ARCap: Collecting High-quality Human Demonstrations for Robot Learning with Augmented Reality Feedback
ARCap addresses the problem of scaling up training datasets for imitation learning in robotics, particularly when demonstrations come from novice users, by improving data quality without requiring physical robot hardware. The contribution is incremental in that it builds on existing portable data collection methods.
The paper tackles the problem of collecting high-quality human demonstrations for robot learning. It proposes ARCap, a portable system that uses augmented reality and haptic feedback to guide users during data collection. As a result, novice users can collect robot-executable data, which in turn enables robots to perform challenging tasks such as manipulation in cluttered environments.
Recent progress in imitation learning from human demonstrations has shown promising results in teaching robots manipulation skills. To further scale up training datasets, recent work has started to use portable data collection devices that do not require physical robot hardware. However, due to the absence of on-robot feedback during data collection, the data quality depends heavily on user expertise, and many devices are limited to specific robot embodiments. We propose ARCap, a portable data collection system that provides visual feedback through augmented reality (AR) and haptic warnings to guide users in collecting high-quality demonstrations. Through extensive user studies, we show that ARCap enables novice users to collect robot-executable data that matches robot kinematics and avoids collisions with the scene. With data collected from ARCap, robots can perform challenging tasks, such as manipulation in cluttered environments and long-horizon cross-embodiment manipulation. ARCap is fully open-source and easy to calibrate; all components are built from off-the-shelf products. More details and results can be found on our website: https://stanford-tml.github.io/ARCap
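The abstract does not say how the feedback is computed, so the following is only an illustrative sketch of the kind of per-frame check that could decide when to warn a user. It is not ARCap's implementation. The function name `check_pose`, the `Sphere` obstacle model, the maximum-reach test as a stand-in for robot kinematics, and the safety margin are all assumptions made for this example.

```python
import math
from dataclasses import dataclass


@dataclass
class Sphere:
    """Hypothetical obstacle: a sphere in the robot base frame (meters)."""
    center: tuple
    radius: float


def check_pose(position, max_reach, obstacles, margin=0.02):
    """Return a list of warnings for one end-effector position.

    Assumption: reachability is approximated by distance from the robot
    base at the origin, and the scene is a set of spheres. A real system
    would use inverse kinematics and a proper collision checker.
    """
    warnings = []
    # Reachability proxy: the target must lie within the arm's maximum reach.
    if math.dist(position, (0.0, 0.0, 0.0)) > max_reach:
        warnings.append("unreachable")
    # Collision proxy: the target must stay outside each obstacle plus a margin.
    for obs in obstacles:
        if math.dist(position, obs.center) < obs.radius + margin:
            warnings.append("collision")
            break
    return warnings


if __name__ == "__main__":
    scene = [Sphere(center=(0.5, 0.0, 0.2), radius=0.05)]
    # A non-empty result is where a haptic or visual warning would be triggered.
    print(check_pose((0.5, 0.0, 0.22), max_reach=0.85, obstacles=scene))
```

In a sketch like this, an empty list means the current hand pose would be executable by the robot, while any warning would be surfaced to the user immediately rather than discovered later at replay time.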