Open-TeleVision: Teleoperation with Immersive Active Visual Feedback
This work addresses the need for intuitive teleoperation to improve data collection for robot learning, though the contribution appears incremental: it builds on existing teleoperation concepts and adds greater immersion.
The paper tackles the problem of collecting high-quality robot demonstration data by proposing Open-TeleVision, an immersive teleoperation system that provides stereoscopic visual feedback and mirrors the operator's arm and hand movements on the robot. Demonstrations collected with the system yielded successful imitation learning policies for four long-horizon tasks on two humanoid robots, deployed in the real world.
Teleoperation serves as a powerful method for collecting on-robot data essential for robot learning from demonstrations. The intuitiveness and ease of use of the teleoperation system are crucial for ensuring high-quality, diverse, and scalable data. To achieve this, we propose an immersive teleoperation system, Open-TeleVision, that allows operators to actively perceive the robot's surroundings in a stereoscopic manner. Additionally, the system mirrors the operator's arm and hand movements on the robot, creating an immersive experience as if the operator's mind were transmitted to a robot embodiment. We validate the effectiveness of our system by collecting data and training imitation learning policies on four long-horizon, precise tasks (Can Sorting, Can Insertion, Folding, and Unloading) for two different humanoid robots and deploying the policies in the real world. The system is open-sourced at: https://robot-tv.github.io/
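
To make the mirroring idea concrete, the sketch below shows one way operator motion could be turned into robot targets. It is an illustration under stated assumptions, not the Open-TeleVision implementation: it assumes the robot's stereo camera sits on a yaw-pitch neck and that the headset reports head angles and a wrist position relative to the head, none of which the abstract specifies. The names `JointLimit`, `mirror_head_to_neck`, and `mirror_wrist`, as well as the limit and scale values, are hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class JointLimit:
    """Hypothetical joint range in radians (values are not from the paper)."""
    lower: float
    upper: float

    def clamp(self, angle: float) -> float:
        # Keep the commanded angle inside the joint's allowed range.
        return max(self.lower, min(self.upper, angle))


def mirror_head_to_neck(
    head_yaw: float,
    head_pitch: float,
    yaw_limit: JointLimit,
    pitch_limit: JointLimit,
) -> Tuple[float, float]:
    """Map tracked operator head angles to clamped neck targets.

    Assumption: the robot camera is on a two-axis (yaw, pitch) neck.
    """
    return yaw_limit.clamp(head_yaw), pitch_limit.clamp(head_pitch)


def mirror_wrist(
    wrist_in_head: Tuple[float, float, float],
    scale: float = 1.0,
) -> Tuple[float, float, float]:
    """Scale an operator wrist position, expressed relative to the head,
    into a target position relative to the robot's head frame.

    Assumption: a single uniform scale accounts for the difference in
    arm length between operator and robot.
    """
    x, y, z = wrist_in_head
    return (scale * x, scale * y, scale * z)
```

In such a setup, the clamped neck angles would drive where the stereo camera looks, and the scaled wrist position would be handed to an inverse kinematics solver for the arm; both downstream steps are outside this sketch.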