RO AI | Sep 23, 2024

Whole-Body Teleoperation for Mobile Manipulation at Zero Added Cost

Daniel Honerkamp, Harsh Mahesheka, Jan Ole von Hartz, Tim Welschehold, Abhinav Valada

arXiv:2409.15095v2 | 10.412 citations | h-index: 16

Originality: Highly original

AI Analysis

This work addresses the costly and time-intensive collection of demonstration data for mobile manipulators by providing a low-cost teleoperation solution that improves efficiency and transferability.

The authors address cumbersome data collection for mobile manipulators with MoMa-Teleop, a teleoperation method that infers end-effector motions from standard interfaces and delegates base motions to a reinforcement learning agent. They report a significant reduction in task completion time and efficient imitation learning from as few as five demonstrations.

Demonstration data plays a key role in learning complex behaviors and training robotic foundation models. While effective control interfaces exist for static manipulators, data collection remains cumbersome and time-intensive for mobile manipulators due to their large number of degrees of freedom. While specialized hardware, avatars, or motion tracking can enable whole-body control, these approaches are either expensive, robot-specific, or suffer from the embodiment mismatch between robot and human demonstrator. In this work, we present MoMa-Teleop, a novel teleoperation method that infers end-effector motions from existing interfaces and delegates the base motions to a previously developed reinforcement learning agent, leaving the operator to focus fully on the task-relevant end-effector motions. This enables whole-body teleoperation of mobile manipulators with no additional hardware or setup costs via standard interfaces such as joysticks or hand guidance. Moreover, the operator is not bound to a tracked workspace and can move freely with the robot over spatially extended tasks. We demonstrate that our approach results in a significant reduction in task completion time across a variety of robots and tasks. As the generated data covers diverse whole-body motions without embodiment mismatch, it enables efficient imitation learning. By focusing on task-specific end-effector motions, our approach learns skills that transfer to unseen settings, such as new obstacles or changed object positions, from as few as five demonstrations. We make code and videos available at https://moma-teleop.cs.uni-freiburg.de.
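
The division of labor described in the abstract (the operator commands only the end-effector, while a separate agent chooses the base motion) can be illustrated with a minimal sketch. This is not the authors' implementation: every name below (`ee_velocity_from_joystick`, `placeholder_base_policy`, `teleop_step`) and every constant is hypothetical, and the base controller is a simple hand-written proportional rule standing in for the reinforcement learning agent the paper actually uses.

```python
import math


def ee_velocity_from_joystick(axes, max_speed=0.1):
    """Map joystick axes in [-1, 1] to an end-effector linear velocity (m/s).

    Hypothetical mapping: each axis is clipped and scaled by max_speed.
    """
    clipped = [max(-1.0, min(1.0, a)) for a in axes]
    return tuple(max_speed * a for a in clipped)


def placeholder_base_policy(ee_offset_xy, preferred_reach=0.6,
                            gain=1.0, max_speed=0.3):
    """Stand-in for the learned base agent (assumption, NOT the paper's policy).

    Drives the base along the direction of the end-effector so that the
    end-effector stays roughly preferred_reach metres from the base.
    ee_offset_xy is the end-effector position relative to the base, in metres.
    """
    dist = math.hypot(ee_offset_xy[0], ee_offset_xy[1])
    if dist < 1e-9:
        return (0.0, 0.0)  # direction undefined, so do not move the base
    error = dist - preferred_reach
    speed = max(-max_speed, min(max_speed, gain * error))
    return (speed * ee_offset_xy[0] / dist, speed * ee_offset_xy[1] / dist)


def teleop_step(joystick_axes, ee_offset_xy):
    """One control step: operator input sets the arm, the policy sets the base."""
    return {
        "ee_velocity": ee_velocity_from_joystick(joystick_axes),
        "base_velocity": placeholder_base_policy(ee_offset_xy),
    }


if __name__ == "__main__":
    # The operator pushes the stick forward while the arm is stretched out to 1 m.
    print(teleop_step((1.0, 0.0, 0.0), (1.0, 0.0)))
```

The point of the structure is that `teleop_step` never asks the operator for a base command, which mirrors the claim that the operator can focus fully on the task-relevant end-effector motions. In the paper's system the base command comes from a trained agent rather than a fixed rule like the one above.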
