RO AI | Mar 21, 2024

Learning Quadruped Locomotion Using Differentiable Simulation

Yunlong Song, Sangbae Kim, Davide Scaramuzza

arXiv:2403.14864v4 | 22.741 citations | h-index: 18 | CoRL

Originality: Incremental advance

AI Analysis

This work addresses the challenge of training legged robots efficiently and stably, offering a novel alternative to traditional reinforcement learning methods. The advance is incremental, however, in that it applies differentiable simulation to real-world quadruped tasks.

This work tackles quadruped locomotion learning with a differentiable simulation framework that pairs a high-fidelity simulator for forward dynamics with a simplified surrogate model for gradient computation. With this framework the robot masters diverse locomotion skills on challenging terrains in minutes, and the method outperforms reinforcement learning methods such as PPO in sample efficiency.

This work explores the potential of using differentiable simulation for learning quadruped locomotion. Differentiable simulation promises fast convergence and stable training by computing low-variance first-order gradients using robot dynamics. However, its usage for legged robots is still limited to simulation. The main challenge lies in the complex optimization landscape of robotic tasks due to discontinuous dynamics. This work proposes a new differentiable simulation framework to overcome these challenges. Our approach combines a high-fidelity, non-differentiable simulator for forward dynamics with a simplified surrogate model for gradient backpropagation. This approach maintains simulation accuracy by aligning the robot states from the surrogate model with those of the precise, non-differentiable simulator. Our framework enables learning quadruped walking in simulation in minutes without parallelization. When augmented with GPU parallelization, our approach allows the quadruped robot to master diverse locomotion skills on challenging terrains in minutes. We demonstrate that differentiable simulation outperforms a reinforcement learning algorithm (PPO) by achieving significantly better sample efficiency while maintaining its effectiveness in handling large-scale environments. Our method represents one of the first successful applications of differentiable simulation to real-world quadruped locomotion, offering a compelling alternative to traditional RL methods.
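The split described in the abstract, a precise simulator for the forward pass and a simplified surrogate for the backward pass, can be illustrated with a toy problem. The sketch below is not the authors' implementation: it assumes a 1-D point mass instead of a quadruped, a hypothetical PD policy with two learned gains, and a hand-written adjoint in place of an autodiff library. All function names and constants are invented for illustration. The "precise" simulator adds drag and a floor contact that the surrogate ignores. Because this surrogate is linear, its Jacobians are constant; the simulator's states enter the gradient through the loss terms and the policy's parameter derivatives, which is the toy analogue of aligning the surrogate with the simulator.

```python
# Toy 1-D sketch; every name and constant here is hypothetical.
DT, MASS, DRAG, TARGET, STEPS = 0.05, 1.0, 0.1, 1.0, 100


def precise_step(p, v, u):
    """Stand-in for the high-fidelity simulator, treated as a black box.

    It models drag and an inelastic floor at p = 0; the surrogate has neither.
    """
    v = v + DT * (u - DRAG * v) / MASS
    p = p + DT * v
    if p < 0.0:  # non-smooth contact event
        p, v = 0.0, 0.0
    return p, v


def policy(p, v, kp, kd):
    """PD controller whose gains (kp, kd) are the learned parameters."""
    return kp * (TARGET - p) - kd * v


def rollout(kp, kd):
    """Forward pass: states come only from the precise simulator."""
    states = [(0.0, 0.0)]
    for _ in range(STEPS):
        p, v = states[-1]
        states.append(precise_step(p, v, policy(p, v, kp, kd)))
    loss = sum((p - TARGET) ** 2 for p, _ in states[1:])
    return states, loss


def surrogate_gradient(states, kp, kd):
    """Backward pass through the surrogate v' = v + DT*u/MASS, p' = p + DT*v'.

    The adjoint recursion is evaluated at the simulator's recorded states,
    not at states the surrogate would have produced on its own.
    """
    grad_kp = grad_kd = 0.0
    lam_p = lam_v = 0.0  # dL/d(state) flowing back from later steps
    for t in reversed(range(STEPS)):
        p, v = states[t]
        p_next, _ = states[t + 1]
        g_p = lam_p + 2.0 * (p_next - TARGET)  # add direct loss term at t+1
        g_v = lam_v
        # Surrogate Jacobians w.r.t. u: dp'/du = DT^2/MASS, dv'/du = DT/MASS.
        g_u = g_p * DT * DT / MASS + g_v * DT / MASS
        grad_kp += g_u * (TARGET - p)  # du/dkp
        grad_kd += g_u * (-v)          # du/dkd
        # dp'/dp = 1, dp'/dv = DT, dv'/dv = 1, plus the feedback path via u.
        lam_p = g_p - kp * g_u
        lam_v = g_p * DT + g_v - kd * g_u
    return grad_kp, grad_kd


def train(kp=1.0, kd=1.0, lr=0.01, iters=200):
    """Plain gradient descent on the gains using the surrogate gradient."""
    history = []
    for _ in range(iters):
        states, loss = rollout(kp, kd)
        history.append(loss)
        g_kp, g_kd = surrogate_gradient(states, kp, kd)
        kp, kd = kp - lr * g_kp, kd - lr * g_kd
    history.append(rollout(kp, kd)[1])  # loss of the final gains
    return kp, kd, history
```

Calling `train()` returns the final gains and the loss recorded at each iteration. The surrogate gradient is biased because it omits drag, so the gains settle where the surrogate's gradient vanishes rather than at the simulator's true optimum. The tracking loss still falls, because every rollout is scored on the precise simulator's states.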
