CausalAgents: A Robustness Benchmark for Motion Forecasting using Causal Relationships
This work addresses model robustness in motion forecasting for autonomous vehicles, providing a benchmark and causal agent labels intended to improve safety. It is incremental in that it builds on existing datasets and methods.
The authors address the challenge of ensuring safe and reliable motion forecasting for autonomous vehicles by creating a benchmark that applies perturbations to existing data. They find that state-of-the-art models show a 25-38% relative change in minADE when non-causal agents are deleted from the scene.
As machine learning models become increasingly prevalent in motion forecasting for autonomous vehicles (AVs), it is critical to ensure that model predictions are safe and reliable. However, exhaustively collecting and labeling the data necessary to fully test the long tail of rare and challenging scenarios is difficult and expensive. In this work, we construct a new benchmark for evaluating and improving model robustness by applying perturbations to existing data. Specifically, we conduct an extensive labeling effort to identify causal agents, or agents whose presence influences human drivers' behavior in any way, in the Waymo Open Motion Dataset (WOMD), and we use these labels to perturb the data by deleting non-causal agents from the scene. We evaluate a diverse set of state-of-the-art deep-learning model architectures on our proposed benchmark and find that all models exhibit large shifts even under non-causal perturbations: we observe a 25-38% relative change in minADE compared to the original. We also investigate techniques to improve model robustness, including increasing the training dataset size and using targeted data augmentations that randomly drop non-causal agents throughout training. Finally, we release the causal agent labels (at https://github.com/google-research/causal-agents) as an additional attribute to WOMD, together with the robustness benchmarks, to aid the community in building more reliable and safe deep-learning models for motion forecasting.
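The perturbation and metric described above can be illustrated with a minimal sketch. It is not the released benchmark code: the array layout, the function names (`min_ade`, `drop_non_causal`, `relative_change`), and the definition of relative change as |perturbed - original| / original are all assumptions made for illustration. In practice one would also presumably keep the AV and any agent being predicted, regardless of label.

```python
import numpy as np


def min_ade(pred, gt):
    """Minimum average displacement error over K candidate trajectories.

    pred: (K, T, 2) predicted xy positions; gt: (T, 2) ground-truth positions.
    """
    # L2 distance at every timestep for every candidate, shape (K, T).
    dists = np.linalg.norm(pred - gt[None, :, :], axis=-1)
    # Average over time per candidate, then take the best candidate.
    return float(dists.mean(axis=1).min())


def drop_non_causal(agents, is_causal):
    """Keep only agents labeled causal (hypothetical array layout).

    agents: (N, T, D) per-agent state histories; is_causal: (N,) boolean labels.
    """
    return agents[np.asarray(is_causal, dtype=bool)]


def relative_change(original, perturbed):
    """Assumed definition: |perturbed - original| / original."""
    return abs(perturbed - original) / original


# Toy scene: 3 context agents, 2 timesteps, xy state; only agent 0 is causal.
agents = np.zeros((3, 2, 2))
kept = drop_non_causal(agents, [True, False, False])

# Toy forecast with K=2 candidates over T=2 timesteps.
gt = np.array([[0.0, 0.25], [1.0, 0.25]])
pred = np.array([[[0.0, 0.0], [1.0, 0.0]],
                 [[0.0, 1.0], [1.0, 1.0]]])
print(kept.shape, min_ade(pred, gt))
```

To measure robustness under this sketch, one would run the same model on the original and the perturbed scene, compute `min_ade` for each, and pass the two values to `relative_change`. A model that ignores non-causal agents entirely would score 0.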