RO, AI, LG | Feb 15, 2025

Bridging the Sim-to-Real Gap for Athletic Loco-Manipulation

arXiv:2502.10894v1 | 23 citations | h-index: 15 | Robotics
Originality: Incremental advance
AI Analysis

This paper addresses the sim-to-real gap for athletic robot performance, but the advance is incremental: it builds on existing methods for reward design and transfer learning.

The paper tackles training robots for athletic loco-manipulation with task rewards, which are prone to exploitation and give exploration little direction. It proposes a two-stage pipeline that combines an Unsupervised Actuator Net with a pre-training and fine-tuning strategy, and reports high-fidelity sim-to-real transfer on lifting, throwing, and dragging tasks.

Achieving athletic loco-manipulation on robots requires moving beyond traditional tracking rewards - which simply guide the robot along a reference trajectory - to task rewards that drive truly dynamic, goal-oriented behaviors. Commands such as "throw the ball as far as you can" or "lift the weight as quickly as possible" compel the robot to exhibit the agility and power inherent in athletic performance. However, training solely with task rewards introduces two major challenges: these rewards are prone to exploitation (reward hacking), and the exploration process can lack sufficient direction. To address these issues, we propose a two-stage training pipeline. First, we introduce the Unsupervised Actuator Net (UAN), which leverages real-world data to bridge the sim-to-real gap for complex actuation mechanisms without requiring access to torque sensing. UAN mitigates reward hacking by ensuring that the learned behaviors remain robust and transferable. Second, we use a pre-training and fine-tuning strategy that leverages reference trajectories as initial hints to guide exploration. With these innovations, our robot athlete learns to lift, throw, and drag with remarkable fidelity from simulation to reality.
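The contrast between tracking rewards and task rewards, and the idea of using reference trajectories only as an initial hint, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the function names, the exponential tracking kernel, the `sigma` and `pretrain_steps` parameters, and the hard switch between the two stages are all hypothetical.

```python
import math


def tracking_reward(q, q_ref, sigma=0.25):
    """Reward for staying close to a reference pose (hypothetical form).

    Returns a value in (0, 1]; 1.0 means the pose matches the reference exactly.
    """
    squared_error = sum((a - b) ** 2 for a, b in zip(q, q_ref))
    return math.exp(-squared_error / sigma ** 2)


def task_reward(throw_distance_m):
    """Reward for the outcome itself (hypothetical): farther throws score higher."""
    return throw_distance_m


def staged_reward(q, q_ref, throw_distance_m, step, pretrain_steps=1000):
    """Assumed two-stage schedule: track the reference first, then optimize the task."""
    if step < pretrain_steps:
        # Pre-training: the reference trajectory guides exploration.
        return tracking_reward(q, q_ref)
    # Fine-tuning: only the task outcome is rewarded.
    return task_reward(throw_distance_m)
```

In this sketch the tracking term says nothing about how far the ball travels, while the task term says nothing about how the motion should look, which is why an unguided task reward can leave exploration without direction.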

