Hidden Biases of End-to-End Driving Models
This paper addresses performance inflation in autonomous driving research, showing that results on current benchmarks may not reflect real-world capabilities.
The paper identifies two systematic biases in end-to-end driving models that artificially inflate performance on CARLA benchmarks and proposes principled alternatives. Incorporating these leads to TF++, which achieves state-of-the-art results on the Longest6 and LAV benchmarks, improving the driving score on Longest6 by 11 points over the best prior work.
End-to-end driving systems have recently made rapid progress, in particular on CARLA. Independent of their major contribution, they introduce changes to minor system components. Consequently, the source of improvements is unclear. We identify two biases that recur in nearly all state-of-the-art methods and are critical for the observed progress on CARLA: (1) lateral recovery via a strong inductive bias towards target point following, and (2) longitudinal averaging of multimodal waypoint predictions for slowing down. We investigate the drawbacks of these biases and identify principled alternatives. By incorporating our insights, we develop TF++, a simple end-to-end method that ranks first on the Longest6 and LAV benchmarks, gaining 11 driving score over the best prior work on Longest6.
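To make the second bias concrete, the toy calculation below sketches why averaging multimodal waypoint predictions can act as implicit slowing down. It is an illustration under stated assumptions, not the paper's implementation: the two modes ("go" and "stop"), their probabilities, the waypoint spacing, and the helper names `expected_waypoints` and `implied_speed` are all hypothetical. The only premise is that a deterministic regressor trained with an L2 loss tends towards the probability-weighted mean of the possible futures.

```python
# Illustrative sketch only: all numbers and function names are assumptions,
# not values or code from the paper.

def expected_waypoints(modes):
    """Probability-weighted mean of several waypoint sequences.

    modes: list of (probability, [longitudinal positions in metres]).
    This mean is what an L2-trained deterministic regressor tends towards
    when the future is multimodal.
    """
    horizon = len(modes[0][1])
    return [sum(p * wps[t] for p, wps in modes) for t in range(horizon)]


def implied_speed(waypoints, dt):
    """Average speed (m/s) implied by the spacing of the waypoints."""
    return (waypoints[-1] - waypoints[0]) / (dt * (len(waypoints) - 1))


DT = 0.5                      # seconds between waypoints (assumed)
go = [2.0, 4.0, 6.0, 8.0]     # mode 1: keep driving at 4 m/s
stop = [0.0, 0.0, 0.0, 0.0]   # mode 2: stand still

# Suppose the model is 70% sure it should continue and 30% sure it should stop.
averaged = expected_waypoints([(0.7, go), (0.3, stop)])
speed = implied_speed(averaged, DT)

print(f"go mode speed:       {implied_speed(go, DT):.2f} m/s")
print(f"averaged prediction: {speed:.2f} m/s")
```

In this toy setting the averaged waypoints are spaced more closely than in the "go" mode, so a controller tracking them drives at roughly 70% of the "go" speed. The vehicle slows down in proportion to the model's uncertainty, even though neither mode on its own prescribes that intermediate speed.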