RO · LG · Sep 24, 2021

Learning to Walk in Minutes Using Massively Parallel Deep Reinforcement Learning

arXiv:2109.11978v3 · 947 citations · Has Code
Originality: Incremental advance
AI Analysis

This work addresses slow training times for real-world robotic locomotion, enabling rapid policy development for legged robots such as ANYmal. The advance is incremental: the efficiency gain comes mainly from parallelism.

The authors use massively parallel deep reinforcement learning on a single workstation GPU to generate walking policies quickly. Training takes under four minutes for flat terrain and twenty minutes for uneven terrain, a speedup of multiple orders of magnitude over prior work.

In this work, we present and study a training set-up that achieves fast policy generation for real-world robotic tasks by using massive parallelism on a single workstation GPU. We analyze and discuss the impact of different training algorithm components in the massively parallel regime on the final policy performance and training times. In addition, we present a novel game-inspired curriculum that is well suited for training with thousands of simulated robots in parallel. We evaluate the approach by training the quadrupedal robot ANYmal to walk on challenging terrain. The parallel approach allows training policies for flat terrain in under four minutes, and in twenty minutes for uneven terrain. This represents a speedup of multiple orders of magnitude compared to previous work. Finally, we transfer the policies to the real robot to validate the approach. We open-source our training code to help accelerate further research in the field of learned legged locomotion.
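The abstract mentions a game-inspired curriculum for thousands of simulated robots but does not state its rules. The sketch below only illustrates the general idea of per-robot difficulty levels that rise and fall with performance, as levels do in a game. It is not the authors' implementation: `CurriculumConfig`, `update_levels`, the thresholds, and the distance-based progress measure are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CurriculumConfig:
    # All values are hypothetical, picked only to make the example concrete.
    max_level: int = 9              # hardest terrain level available
    promote_fraction: float = 0.8   # promote at >= 80% of the commanded distance
    demote_fraction: float = 0.4    # demote below 40% of the commanded distance


def update_levels(
    levels: List[int],
    walked: List[float],
    commanded: List[float],
    cfg: CurriculumConfig,
) -> List[int]:
    """Return new terrain levels, one per simulated robot.

    Each robot is judged independently at the end of its episode, so the
    same rule can be applied to thousands of robots in one pass.
    """
    new_levels = []
    for level, w, c in zip(levels, walked, commanded):
        # Fraction of the commanded distance actually covered (assumed metric).
        progress = w / c if c > 0 else 0.0
        if progress >= cfg.promote_fraction:
            level = min(level + 1, cfg.max_level)   # harder terrain next episode
        elif progress < cfg.demote_fraction:
            level = max(level - 1, 0)               # easier terrain next episode
        new_levels.append(level)
    return new_levels


# Four robots: one succeeds, one fails, one is already at the top, one fails at the bottom.
levels = update_levels(
    levels=[0, 3, 9, 0],
    walked=[5.0, 1.0, 9.0, 0.5],
    commanded=[5.0, 5.0, 10.0, 5.0],
    cfg=CurriculumConfig(),
)
print(levels)
```

Because each robot's level depends only on its own last episode, an update of this kind needs no global schedule, which is one plausible reason such a scheme suits training with many robots in parallel.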

Code Implementations: 5 repos
