Failure Modes for Deep Learning-Based Online Mapping: How to Measure and Address Them

arXiv:2603.19852 · 38.3 · h-index: 4

Predicted impact: top 80% in CV · last 90 days · Originality: Incremental advance

AI Analysis

This paper addresses generalization failures in online mapping for autonomous driving, offering an incremental advance through better evaluation measures and dataset design.

The paper tackles the failure of deep learning-based online mapping models to generalize beyond familiar environments. It proposes a framework that measures two failure modes, memorization of input features and overfitting to known map geometries, and shows that training sets with diverse map geometry improve performance on nuScenes and Argoverse 2.

Deep learning-based online mapping has emerged as a cornerstone of autonomous driving, yet these models frequently fail to generalize beyond familiar environments. We propose a framework to identify and measure the underlying failure modes by disentangling two effects: Memorization of input features and overfitting to known map geometries. We propose measures based on evaluation subsets that control for geographical proximity and geometric similarity between training and validation scenes. We introduce Fréchet distance-based reconstruction statistics that capture per-element shape fidelity without threshold tuning, and define complementary failure-mode scores: a localization overfitting score quantifying the performance drop when geographic cues disappear, and a map geometry overfitting score measuring degradation as scenes become geometrically novel. Beyond models, we analyze dataset biases and contribute map geometry-aware diagnostics: A minimum-spanning-tree (MST) diversity measure for training sets and a symmetric coverage measure to quantify geometric similarity between splits. Leveraging these, we formulate an MST-based sparsification strategy that reduces redundancy and improves balancing and performance while shrinking training size. Experiments on nuScenes and Argoverse 2 across multiple state-of-the-art models yield more trustworthy assessment of generalization and show that map geometry-diverse and balanced training sets lead to improved performance. Our results motivate failure-mode-aware protocols and map geometry-centric dataset design for deployable online mapping.
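The abstract does not spell out how the Fréchet-based statistics or the MST diversity measure are computed, so the following is only an illustrative sketch under stated assumptions. It assumes map elements are 2D polylines, uses the standard discrete Fréchet distance as the shape distance, and treats the total edge weight of a minimum spanning tree over pairwise distances as a rough diversity proxy. The function names `discrete_frechet` and `mst_total_weight` are hypothetical and are not taken from the paper.

```python
import math


def discrete_frechet(p, q):
    """Discrete Fréchet distance between two polylines (lists of (x, y) points)."""
    n, m = len(p), len(q)
    # ca[i][j] holds the coupling distance for the prefixes p[:i+1] and q[:j+1].
    ca = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = math.dist(p[i], q[j])
            if i == 0 and j == 0:
                ca[i][j] = d
            elif i == 0:
                ca[i][j] = max(ca[0][j - 1], d)
            elif j == 0:
                ca[i][j] = max(ca[i - 1][0], d)
            else:
                ca[i][j] = max(min(ca[i - 1][j], ca[i][j - 1], ca[i - 1][j - 1]), d)
    return ca[n - 1][m - 1]


def mst_total_weight(items, dist):
    """Total MST edge weight on the complete graph over items (Prim's algorithm).

    Assumption: a larger total weight is read as a more geometrically
    diverse set; the paper's actual diversity measure may be defined differently.
    """
    n = len(items)
    if n < 2:
        return 0.0
    in_tree = [False] * n
    best = [math.inf] * n  # cheapest known edge connecting each node to the tree
    best[0] = 0.0
    total = 0.0
    for _ in range(n):
        # Pick the cheapest node that is not yet in the tree.
        u = min((k for k in range(n) if not in_tree[k]), key=lambda k: best[k])
        in_tree[u] = True
        total += best[u]
        for v in range(n):
            if not in_tree[v]:
                w = dist(items[u], items[v])
                if w < best[v]:
                    best[v] = w
    return total


# Three parallel lane-like polylines at different lateral offsets.
lane_a = [(0, 0), (1, 0), (2, 0)]
lane_b = [(0, 1), (1, 1), (2, 1)]
lane_c = [(0, 3), (1, 3), (2, 3)]

diverse = mst_total_weight([lane_a, lane_b, lane_c], discrete_frechet)
redundant = mst_total_weight([lane_a, lane_a, lane_a], discrete_frechet)
```

In this toy setup, a set of identical polylines yields a total weight of zero, while geometrically distinct polylines yield a positive value. That is the intuition behind using an MST to flag redundancy, though the scene-level distance used in the paper is not given in the abstract.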
