RO · AI · CV · LG · Dec 1, 2025

RoaD: Rollouts as Demonstrations for Closed-Loop Supervised Fine-Tuning of Autonomous Driving Policies

arXiv:2512.01993v1 · 3 citations · h-index: 31
Originality: Incremental advance
AI Analysis

This work addresses compounding errors in autonomous driving, with the goal of safer and more robust policies. It is an incremental improvement over prior closed-loop supervised fine-tuning methods.

The paper tackles covariate shift in autonomous driving policies trained via open-loop behavior cloning. It introduces RoaD, a method that uses the policy's own closed-loop rollouts, generated with expert guidance, as training data. In a high-fidelity simulator, this yields a 41% improvement in driving score and a 54% reduction in collisions.

Autonomous driving policies are typically trained via open-loop behavior cloning of human demonstrations. However, such policies suffer from covariate shift when deployed in closed loop, leading to compounding errors. We introduce Rollouts as Demonstrations (RoaD), a simple and efficient method to mitigate covariate shift by leveraging the policy's own closed-loop rollouts as additional training data. During rollout generation, RoaD incorporates expert guidance to bias trajectories toward high-quality behavior, producing informative yet realistic demonstrations for fine-tuning. This approach enables robust closed-loop adaptation with orders of magnitude less data than reinforcement learning, and avoids restrictive assumptions of prior closed-loop supervised fine-tuning (CL-SFT) methods, allowing broader application domains, including end-to-end driving. We demonstrate the effectiveness of RoaD on WOSAC, a large-scale traffic simulation benchmark, where it performs similarly to or better than the prior CL-SFT method, and in AlpaSim, a high-fidelity neural reconstruction-based simulator for end-to-end driving, where it improves driving score by 41% and reduces collisions by 54%.
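
The loop described in the abstract (roll the policy out in closed loop, nudge the rollout with an expert, then fine-tune on the resulting trajectories) can be illustrated with a toy sketch. Everything here is an assumption made for illustration and is not taken from the paper: the 1-D lateral-offset state, the linear policy `a = gain * s`, the proportional `expert_action`, the action-blending guidance rule, and the closed-form least-squares fit that stands in for supervised fine-tuning. The abstract does not specify RoaD's actual guidance mechanism or training objective.

```python
import numpy as np


def expert_action(state: float) -> float:
    """Hypothetical expert: steer back toward the lane centre (state = lateral offset)."""
    return -0.5 * state


def rollout(gain: float, guidance: float, steps: int = 20, start: float = 1.0):
    """Run the linear policy a = gain * s in closed loop and record (state, action) pairs."""
    states, actions = [], []
    state = start
    for _ in range(steps):
        policy_a = gain * state
        # Assumed guidance rule: blend the policy's action toward the expert's.
        action = (1.0 - guidance) * policy_a + guidance * expert_action(state)
        states.append(state)
        actions.append(action)
        state = state + action  # toy dynamics: the action shifts the offset directly
    return np.array(states), np.array(actions)


def fine_tune(states: np.ndarray, actions: np.ndarray) -> float:
    """Supervised fit of the gain to the rollout actions (closed-form least squares)."""
    return float(states @ actions / (states @ states))


def rollouts_as_demonstrations(gain: float, rounds: int = 3, guidance: float = 0.5) -> float:
    """Alternate guided rollout collection and supervised fine-tuning on those rollouts."""
    for _ in range(rounds):
        states, actions = rollout(gain, guidance)
        gain = fine_tune(states, actions)
    return gain


initial_gain = -0.1  # stand-in for a weak open-loop behavior-cloned policy
tuned_gain = rollouts_as_demonstrations(initial_gain)
```

In this toy setting the training states come from the policy's own closed-loop trajectories rather than from the expert's state distribution, which is the property the abstract highlights. Each round moves the gain part of the way from the initial value toward the hypothetical expert's gain of -0.5. A real driving policy would replace the scalar gain with a learned network and the least-squares fit with gradient-based fine-tuning.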
