Learning Agile Locomotion via Adversarial Training
This work aims to reduce the human effort required to design training environments for agile locomotion in legged robots, an incremental improvement over prior methods.
The paper automates the design of agile locomotion controllers for legged robots by introducing a multi-agent adversarial training system in which a quadruped robot chases an adversary, reducing the need for manual environment design. The results show that the learned controller significantly outperforms carefully designed baselines, and that an ensemble of adversaries is essential for mastering agility.
Developing controllers for agile locomotion is a long-standing challenge for legged robots. Reinforcement learning (RL) and Evolution Strategies (ES) hold the promise of automating the design of such controllers. However, dedicated and careful human effort is required to design training environments that promote agility. In this paper, we present a multi-agent learning system in which a quadruped robot (protagonist) learns to chase another robot (adversary) while the latter learns to escape. We find that this adversarial training process not only encourages agile behaviors but also effectively alleviates the laborious environment design effort. In contrast to prior work that used only one adversary, we find that training an ensemble of adversaries, each of which specializes in a different escaping strategy, is essential for the protagonist to master agility. Through extensive experiments, we show that the locomotion controller learned with adversarial training significantly outperforms carefully designed baselines.
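To make the training loop concrete, the following is a minimal sketch of alternating protagonist and adversary updates against an ensemble, not the paper's implementation. Everything in it is an assumption made for illustration: the 1-D chase task, the two-parameter linear policies, the final-distance reward, the simple antithetic ES update, and the names `rollout`, `es_step`, and `train`. The abstract does not specify any of these details.

```python
import random


def rollout(chaser, escaper, steps=20):
    """Toy 1-D chase (hypothetical task). Returns the final distance:
    the chaser wants it small, the escaper wants it large."""
    pos_c, pos_e = 0.0, 1.0
    for _ in range(steps):
        gap = pos_e - pos_c
        # Each policy is [gain, bias]; velocities are clipped, and the
        # escaper is assumed slightly slower than the chaser.
        v_c = max(-1.0, min(1.0, chaser[0] * gap + chaser[1]))
        v_e = max(-0.8, min(0.8, escaper[0] * gap + escaper[1]))
        pos_c += 0.1 * v_c
        pos_e += 0.1 * v_e
    return abs(pos_e - pos_c)


def es_step(theta, fitness, rng, sigma=0.1, lr=0.05, pop=16):
    """One ascent step of a basic ES with antithetic (mirrored) sampling."""
    grad = [0.0] * len(theta)
    for _ in range(pop):
        eps = [rng.gauss(0.0, 1.0) for _ in theta]
        plus = [t + sigma * e for t, e in zip(theta, eps)]
        minus = [t - sigma * e for t, e in zip(theta, eps)]
        diff = fitness(plus) - fitness(minus)
        for i, e in enumerate(eps):
            grad[i] += diff * e
    return [t + lr * g / (2.0 * sigma * pop) for t, g in zip(theta, grad)]


def train(num_adversaries=3, iterations=30, seed=0):
    """Alternate updates: the protagonist trains against the whole ensemble,
    then each adversary trains against the updated protagonist."""
    rng = random.Random(seed)
    protagonist = [0.0, 0.0]
    # Different random initializations stand in for different escape strategies.
    adversaries = [[rng.gauss(0.0, 0.5), rng.gauss(0.0, 0.5)]
                   for _ in range(num_adversaries)]
    for _ in range(iterations):
        # Protagonist maximizes the negative mean final distance over the ensemble.
        protagonist = es_step(
            protagonist,
            lambda p: -sum(rollout(p, a) for a in adversaries) / len(adversaries),
            rng,
        )
        # Each adversary maximizes its final distance from the protagonist.
        adversaries = [es_step(a, lambda q: rollout(protagonist, q), rng)
                       for a in adversaries]
    return protagonist, adversaries
```

Calling `train(num_adversaries=1)` gives a single-adversary variant of the same loop, which is one way to compare the two setups in this toy setting. Nothing here is meant to reproduce the paper's quadruped results.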