LG AI RO, Feb 5, 2025

Robust Autonomy Emerges from Self-Play

Marco Cusumano-Towner, David Hafner, Alex Hertzberg, Brody Huval, Aleksei Petrenko, Eugene Vinitsky, Erik Wijmans, Taylor Killian, Stuart Bowers, Ozan Sener, Philipp Krähenbühl, Vladlen Koltun

arXiv:2502.03349v1, 34.859 citations, h-index: 113, ICML

Originality: Highly original

AI Analysis

This paper addresses robust autonomous driving for real-world applications and represents a significant advance rather than an incremental improvement.

The paper tackles autonomous driving by training a policy entirely through self-play in simulation. The policy achieves state-of-the-art performance on three independent benchmarks and averages 17.5 years of continuous driving between incidents in simulation.

Self-play has powered breakthroughs in two-player and multi-player games. Here we show that self-play is a surprisingly effective strategy in another domain. We show that robust and naturalistic driving emerges entirely from self-play in simulation at unprecedented scale -- 1.6 billion km of driving. This is enabled by Gigaflow, a batched simulator that can synthesize and train on 42 years of subjective driving experience per hour on a single 8-GPU node. The resulting policy achieves state-of-the-art performance on three independent autonomous driving benchmarks. The policy outperforms the prior state of the art when tested on recorded real-world scenarios, amidst human drivers, without ever seeing human data during training. The policy is realistic when assessed against human references and achieves unprecedented robustness, averaging 17.5 years of continuous driving between incidents in simulation.
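To put the throughput figure in perspective, the 42 years of driving experience per wall-clock hour can be converted into a real-time speedup factor. The sketch below is only back-of-the-envelope arithmetic on the number quoted in the abstract. The function name `realtime_speedup`, the 365.25-day year, and the even split across the 8 GPUs are assumptions made for illustration, not details from the paper.

```python
# Assumption: one year is a Julian year of 365.25 days (8766 hours).
HOURS_PER_YEAR = 365.25 * 24


def realtime_speedup(years_per_hour: float, num_gpus: int = 1) -> float:
    """Simulated hours produced per wall-clock hour, divided across num_gpus."""
    return years_per_hour * HOURS_PER_YEAR / num_gpus


# 42 years of experience per hour is the figure quoted in the abstract.
node_speedup = realtime_speedup(42)     # whole 8-GPU node
gpu_speedup = realtime_speedup(42, 8)   # per GPU, assuming an even split

print(f"node: {node_speedup:,.1f}x real time")
print(f"per GPU: {gpu_speedup:,.1f}x real time")
```

Under these assumptions the node generates experience roughly 368,000 times faster than real time, or about 46,000 times faster per GPU.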

