Closed-Loop Supervised Fine-Tuning of Tokenized Traffic Models
This work addresses realistic traffic simulation for autonomous driving systems and is an incremental improvement over existing tokenized traffic models.
The paper tackles covariate shift in tokenized traffic simulation models by introducing a closed-loop fine-tuning strategy called CAT-K, which enables a 7M-parameter model to outperform a 102M-parameter model from the same model family and reach the top of the Waymo Sim Agent Challenge leaderboard at the time of submission.
Traffic simulation aims to learn a policy for traffic agents that, when unrolled in closed loop, faithfully recovers the joint distribution of trajectories observed in the real world. Inspired by large language models, tokenized multi-agent policies have recently become the state of the art in traffic simulation. However, they are typically trained through open-loop behavior cloning, and thus suffer from covariate shift when executed in closed loop during simulation. In this work, we present Closest Among Top-K (CAT-K) rollouts, a simple yet effective closed-loop fine-tuning strategy to mitigate covariate shift. CAT-K fine-tuning only requires existing trajectory data, without reinforcement learning or generative adversarial imitation. Concretely, CAT-K fine-tuning enables a small 7M-parameter tokenized traffic simulation policy to outperform a 102M-parameter model from the same model family, achieving the top spot on the Waymo Sim Agent Challenge leaderboard at the time of submission. The code is available at https://github.com/NVlabs/catk.
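The abstract does not spell out the rollout procedure, so the following is only a minimal sketch of one plausible reading of "Closest Among Top-K": at each step, keep the K tokens the policy rates most likely and execute the one that lands closest to the logged trajectory. The single-agent 2D setup, the displacement-token vocabulary `vocab`, the callable `policy_logits`, and the choice of training target (the token from the full vocabulary that best recovers the ground truth from the rolled-out state) are all assumptions made for illustration, not the released implementation.

```python
import numpy as np

def cat_k_rollout(policy_logits, vocab, gt_traj, k):
    """Unroll one agent, executing at each step the top-K token closest to ground truth.

    policy_logits: hypothetical callable(state) -> logits over tokens, shape (V,)
    vocab:         hypothetical (V, 2) array of displacement tokens
    gt_traj:       (T+1, 2) logged positions; gt_traj[0] is the start state
    Returns the visited states (T, 2) and one target token index per state (T,).
    """
    state = np.asarray(gt_traj[0], dtype=float)
    states, targets = [], []
    for t in range(1, len(gt_traj)):
        logits = policy_logits(state)
        top_k = np.argsort(logits)[-k:]        # indices of the K most likely tokens
        candidates = state + vocab[top_k]      # next state each of them would produce
        dists = np.linalg.norm(candidates - gt_traj[t], axis=1)

        # Assumed supervision: the token from the full vocabulary that brings
        # the current (possibly drifted) state closest to the logged next state.
        full_dists = np.linalg.norm(state + vocab - gt_traj[t], axis=1)
        states.append(state.copy())
        targets.append(int(np.argmin(full_dists)))

        # Closed-loop step: move to the closest candidate among the top K.
        state = candidates[int(np.argmin(dists))]
    return np.array(states), np.array(targets)
```

Under this reading, K trades off the two extremes: with K equal to the vocabulary size the rollout simply tracks the logged trajectory, while with K = 1 it follows the policy's own most likely token, so the visited states are the ones the policy would actually reach in closed loop. The collected `(states, targets)` pairs could then feed an ordinary cross-entropy fine-tuning step, which is consistent with the abstract's claim that no reinforcement learning or adversarial imitation is needed.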