MapRF: Weakly Supervised Online HD Map Construction via NeRF-Guided Self-Training
This work addresses the cost and scalability of HD map construction for autonomous driving. Its contribution is incremental, since it builds on existing weakly supervised and NeRF-based methods.
The paper tackles the cost of 3D map annotations for online HD map construction in autonomous driving. It proposes MapRF, a weakly supervised framework trained with only 2D image labels, which reaches around 75% of the fully supervised baseline's performance on the Argoverse 2 and nuScenes datasets.
Autonomous driving systems benefit from high-definition (HD) maps that provide critical information about road infrastructure. The online construction of HD maps offers a scalable approach to generate local maps from on-board sensors. However, existing methods typically rely on costly 3D map annotations for training, which limits their generalization and scalability across diverse driving environments. In this work, we propose MapRF, a weakly supervised framework that learns to construct 3D maps using only 2D image labels. To generate high-quality pseudo labels, we introduce a novel Neural Radiance Fields (NeRF) module conditioned on map predictions, which reconstructs view-consistent 3D geometry and semantics. These pseudo labels are then iteratively used to refine the map network in a self-training manner, enabling progressive improvement without additional supervision. Furthermore, to mitigate error accumulation during self-training, we propose a Map-to-Ray Matching strategy that aligns map predictions with camera rays derived from 2D labels. Extensive experiments on the Argoverse 2 and nuScenes datasets demonstrate that MapRF achieves performance comparable to fully supervised methods, attaining around 75% of the baseline while surpassing several approaches using only 2D labels. This highlights the potential of MapRF to enable scalable and cost-effective online HD map construction for autonomous driving.
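The abstract does not spell out how Map-to-Ray Matching is computed, so the following is only a minimal geometric sketch of the underlying idea: back-project labeled 2D pixels into camera rays, then measure how far each predicted 3D map point lies from those rays. The function names, the pinhole intrinsics matrix K, the camera-frame coordinates, and the nearest-ray assignment are all illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def pixels_to_rays(pixels, K):
    """Back-project (N, 2) pixel coordinates to unit ray directions
    in the camera frame, assuming a pinhole camera with intrinsics K."""
    ones = np.ones((pixels.shape[0], 1))
    homog = np.hstack([pixels, ones])              # (N, 3) homogeneous pixels
    dirs = homog @ np.linalg.inv(K).T              # (N, 3) unnormalized directions
    return dirs / np.linalg.norm(dirs, axis=1, keepdims=True)

def point_to_ray_distances(points, rays):
    """Distance from each 3D point (M, 3) to each ray (N, 3) that starts
    at the camera origin. Returns an (M, N) matrix."""
    proj = points @ rays.T                         # (M, N) length along each ray
    proj = np.clip(proj, 0.0, None)                # rays only extend in front of the camera
    closest = proj[:, :, None] * rays[None, :, :]  # (M, N, 3) closest point on each ray
    return np.linalg.norm(points[:, None, :] - closest, axis=2)

def match_points_to_rays(points, rays):
    """Hypothetical matching rule: assign each predicted point to its
    nearest ray and return (ray indices, distances)."""
    dists = point_to_ray_distances(points, rays)
    idx = dists.argmin(axis=1)
    return idx, dists[np.arange(len(points)), idx]

# Toy example with made-up intrinsics and points (camera frame, z forward).
K = np.array([[100.0, 0.0, 50.0],
              [0.0, 100.0, 50.0],
              [0.0, 0.0, 1.0]])
label_pixels = np.array([[50.0, 50.0], [150.0, 50.0]])
predicted_points = np.array([[0.0, 1.0, 5.0], [2.0, 0.0, 2.0]])

rays = pixels_to_rays(label_pixels, K)
ray_idx, residuals = match_points_to_rays(predicted_points, rays)
```

In a training setting, residuals of this kind could serve as a loss that pulls map predictions toward the rays implied by the 2D labels. A one-to-one assignment (for example, Hungarian matching) might be preferable to the nearest-ray rule used here.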