Closing the Loop: Motion Prediction Models beyond Open-Loop Benchmarks
This work addresses a critical gap for autonomous vehicle developers by showing that current open-loop benchmarks are insufficient for evaluating real-world driving performance, although its contribution is incremental: it refines evaluation methods rather than introducing a new prediction approach.
The paper tackles the problem that open-loop motion prediction accuracy does not guarantee better closed-loop performance in an autonomous driving stack. It finds that higher accuracy does not always improve closed-loop behavior and that, in some cases, downsized models with up to 86% fewer parameters perform comparably or better.
Fueled by motion prediction competitions and benchmarks, recent years have seen the emergence of increasingly large learning-based prediction models, many with millions of parameters, focused on improving open-loop prediction accuracy by mere centimeters. However, these benchmarks fail to assess whether such improvements translate to better performance when integrated into an autonomous driving stack. In this work, we systematically evaluate the interplay between state-of-the-art motion predictors and motion planners. Our results show that higher open-loop accuracy does not always correlate with better closed-loop driving behavior and that other factors, such as temporal consistency of predictions and planner compatibility, also play a critical role. Furthermore, we investigate downsized variants of these models and, surprisingly, find that in some cases models with up to 86% fewer parameters yield comparable or even superior closed-loop driving performance. Our code is available at https://github.com/continental/pred2plan.
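To make the distinction between open-loop accuracy and temporal consistency concrete, the following minimal sketch compares two hypothetical predictors on a toy straight-line scenario. The function names (`ade`, `temporal_inconsistency`), the particular consistency measure, and all trajectory values are assumptions chosen for illustration; they are not taken from the paper or from the pred2plan repository, and the paper may define its metrics differently.

```python
import math


def ade(pred, gt):
    """Average displacement error between two equal-length (x, y) trajectories."""
    if len(pred) != len(gt) or not pred:
        raise ValueError("trajectories must be non-empty and of equal length")
    return sum(math.hypot(p[0] - g[0], p[1] - g[1]) for p, g in zip(pred, gt)) / len(pred)


def temporal_inconsistency(pred_prev, pred_curr, shift=1):
    """Mean distance between the overlapping parts of two successive predictions.

    Assumption: pred_prev was produced `shift` time steps before pred_curr and
    both are expressed in the same world frame, so pred_prev[shift + i] and
    pred_curr[i] describe the same future time.
    """
    overlap_prev = pred_prev[shift:]
    overlap_curr = pred_curr[:len(overlap_prev)]
    return ade(overlap_prev, overlap_curr)


# Toy ground truth: an agent moving along the x-axis (illustrative values only).
gt_t0 = [(1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (4.0, 0.0)]  # future seen from t=0
gt_t1 = [(2.0, 0.0), (3.0, 0.0), (4.0, 0.0), (5.0, 0.0)]  # future seen from t=1

# Hypothetical predictor A: small lateral error, but the sign flips between frames.
jittery_t0 = [(x, 0.1) for x, _ in gt_t0]
jittery_t1 = [(x, -0.1) for x, _ in gt_t1]

# Hypothetical predictor B: larger lateral error, but stable across frames.
steady_t0 = [(x, 0.15) for x, _ in gt_t0]
steady_t1 = [(x, 0.15) for x, _ in gt_t1]

for name, p0, p1 in [("jittery", jittery_t0, jittery_t1), ("steady", steady_t0, steady_t1)]:
    open_loop = (ade(p0, gt_t0) + ade(p1, gt_t1)) / 2
    flicker = temporal_inconsistency(p0, p1)
    print(f"{name}: mean ADE = {open_loop:.3f}, inconsistency = {flicker:.3f}")
```

In this toy setup the jittery predictor has the lower displacement error, yet its successive forecasts disagree with each other, which could force a downstream planner to keep revising its plan. This is only one plausible way the two properties can diverge, not a reproduction of the paper's experiments.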