Goal-Oriented Reactive Simulation for Closed-Loop Trajectory Prediction
This work targets safer and more interactive autonomous driving by improving collision avoidance through reactive simulation. The contribution is incremental, since it builds on existing trajectory prediction methods.
The paper addresses the covariate shift and compounding errors that trajectory prediction models trained in an open-loop manner suffer in real-world closed-loop settings. It proposes a closed-loop training paradigm with a goal-oriented, transformer-based scene decoder, and reports relative collision rate reductions of up to 27.0% on nuScenes and 79.5% in dense DeepScenario intersections compared to open-loop baselines.
Current trajectory prediction models are primarily trained in an open-loop manner, which often leads to covariate shift and compounding errors when deployed in real-world, closed-loop settings. Furthermore, relying on static datasets or non-reactive log-replay simulators severs the interactive loop, preventing the ego agent from learning to actively negotiate surrounding traffic. In this work, we propose an on-policy closed-loop training paradigm optimized for high-frequency, receding-horizon ego prediction. To ground the ego prediction in a realistic representation of traffic interactions and to achieve reactive consistency, we introduce a goal-oriented, transformer-based scene decoder, resulting in an inherently reactive training simulation. By exposing the ego agent to a mixture of open-loop data and simulated, self-induced states, the model learns recovery behaviors to correct its own execution errors. Extensive evaluation demonstrates that closed-loop training significantly enhances collision avoidance capabilities at high replanning frequencies, yielding relative collision rate reductions of up to 27.0% on nuScenes and 79.5% in dense DeepScenario intersections compared to open-loop baselines. Additionally, we show that a hybrid simulation combining reactive with non-reactive surrounding agents achieves the best balance between immediate interactivity and long-term behavioral stability.
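The following is a minimal sketch of three ideas named in the abstract: receding-horizon replanning, mixing logged (open-loop) states with simulated, self-induced states, and the relative collision rate reduction used as the headline metric. It is not the authors' implementation. The 1-D `State`, the function names, the `execute_k` and `p_sim` parameters, and the toy predictor are all hypothetical and chosen only for illustration.

```python
import random
from typing import Callable, List, Sequence

State = float  # hypothetical 1-D ego position, for illustration only


def closed_loop_rollout(
    predict: Callable[[State], Sequence[State]],
    start: State,
    steps: int,
    execute_k: int = 1,
) -> List[State]:
    """Receding-horizon rollout: execute the first `execute_k` waypoints of
    each plan, then replan from the state the ego actually reached."""
    state = start
    visited = [state]
    while len(visited) - 1 < steps:
        plan = predict(state)  # predicted future positions from the current state
        for waypoint in plan[:execute_k]:
            state = waypoint
            visited.append(state)
            if len(visited) - 1 == steps:
                break
    return visited


def mix_training_states(
    logged: Sequence[State],
    simulated: Sequence[State],
    p_sim: float,
    rng: random.Random,
) -> List[State]:
    """Per time step, pick the simulated (self-induced) state with
    probability `p_sim`, otherwise the logged (open-loop) state."""
    return [s if rng.random() < p_sim else l for l, s in zip(logged, simulated)]


def relative_reduction(baseline_rate: float, new_rate: float) -> float:
    """Relative reduction of a rate with respect to a baseline."""
    return (baseline_rate - new_rate) / baseline_rate


# Toy predictor (assumption): always plans three unit steps ahead.
toy_predict = lambda s: [s + 1.0, s + 2.0, s + 3.0]
trajectory = closed_loop_rollout(toy_predict, start=0.0, steps=4, execute_k=2)

# Illustrative rates only, not values from the paper.
halved = relative_reduction(baseline_rate=0.20, new_rate=0.10)
```

A smaller `execute_k` corresponds to a higher replanning frequency: the ego commits to less of each plan before its next prediction is conditioned on the state it actually reached.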