RO AI LG · Dec 1, 2025

Learning Sim-to-Real Humanoid Locomotion in 15 Minutes

Younggyo Seo, Carmelo Sferrazza, Juyue Chen, Guanya Shi, Rocky Duan, Pieter Abbeel

arXiv:2512.01996v1 · 16.410 citations · h-index: 18 · Has Code

Originality: Incremental advance

AI Analysis

This work addresses slow and unstable training for humanoid control and offers a practical recipe for researchers and engineers. It appears to be an incremental advance, since it builds on existing off-policy RL methods with carefully tuned design choices.

The paper tackles fast and reliable sim-to-real reinforcement learning for humanoid locomotion with a simple recipe based on the off-policy RL algorithms FastSAC and FastTD3. Policies train in 15 minutes on a single RTX 4090 GPU under strong domain randomization, and the recipe is demonstrated on Unitree G1 and Booster T1 robots.

Massively parallel simulation has reduced reinforcement learning (RL) training time for robots from days to minutes. However, achieving fast and reliable sim-to-real RL for humanoid control remains difficult due to the challenges introduced by factors such as high dimensionality and domain randomization. In this work, we introduce a simple and practical recipe based on off-policy RL algorithms, i.e., FastSAC and FastTD3, that enables rapid training of humanoid locomotion policies in just 15 minutes with a single RTX 4090 GPU. Our simple recipe stabilizes off-policy RL algorithms at massive scale with thousands of parallel environments through carefully tuned design choices and minimalist reward functions. We demonstrate rapid end-to-end learning of humanoid locomotion controllers on Unitree G1 and Booster T1 robots under strong domain randomization, e.g., randomized dynamics, rough terrain, and push perturbations, as well as fast training of whole-body human-motion tracking policies. We provide videos and open-source implementation at: https://younggyo.me/fastsac-humanoid.
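To make "strong domain randomization" across thousands of parallel environments more concrete, here is a minimal sketch of per-environment parameter sampling. It is an illustration only, not the authors' implementation: the `EnvParams` fields, the `sample_env_params` helper, the numeric ranges, and the environment count of 4096 are all hypothetical and are not taken from the paper.

```python
import random
from dataclasses import dataclass


@dataclass
class EnvParams:
    friction: float        # ground friction coefficient
    mass_scale: float      # multiplier applied to nominal link masses
    push_velocity: float   # magnitude of a random push perturbation, in m/s


def sample_env_params(num_envs, seed=0):
    """Draw independent dynamics parameters for each parallel environment."""
    rng = random.Random(seed)
    return [
        EnvParams(
            friction=rng.uniform(0.3, 1.2),       # hypothetical range
            mass_scale=rng.uniform(0.8, 1.2),     # hypothetical range
            push_velocity=rng.uniform(0.0, 1.0),  # hypothetical range
        )
        for _ in range(num_envs)
    ]


# Hypothetical environment count; each environment gets its own parameters.
params = sample_env_params(num_envs=4096)
```

Because every environment draws its own parameters, a single batch of experience already covers a wide spread of dynamics, which is the property a policy relies on when it has to transfer from simulation to hardware.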

