Learning to Generate Vectorized Maps at Intersections with Multiple Roadside Cameras
MRC-VMap offers a cost-effective and scalable mapping solution for autonomous navigation systems and addresses limitations of existing offline and online mapping methods, although its contribution is an incremental improvement over prior vision-based approaches.
The paper tackles the problem of generating vectorized maps for autonomous vehicles at intersections. It introduces MRC-VMap, a vision-centric neural network that uses multiple roadside cameras to convert images directly into vectorized maps. In experiments on 4,000 intersections across 4 major Chinese metropolitan areas, it achieves accuracy comparable to high-cost LiDAR-based methods.
Vectorized maps are indispensable for precise navigation and the safe operation of autonomous vehicles. Traditional methods for constructing these maps fall into two categories: offline techniques, which rely on expensive, labor-intensive LiDAR data collection and manual annotation, and online approaches, which use onboard cameras to reduce costs but suffer from limited performance, especially at complex intersections. To bridge this gap, we introduce MRC-VMap, a cost-effective, vision-centric, end-to-end neural network designed to generate high-definition vectorized maps directly at intersections. Leveraging existing roadside surveillance cameras, MRC-VMap directly converts time-aligned, multi-directional images into vectorized map representations. This integrated solution reduces the need for additional intermediate modules, such as separate feature extraction and Bird's-Eye View (BEV) conversion steps, which lowers both computational overhead and error propagation. Moreover, the use of multiple camera views enhances mapping completeness, mitigates occlusions, and provides robust performance under practical deployment constraints. Extensive experiments conducted on 4,000 intersections across 4 major metropolitan areas in China demonstrate that MRC-VMap not only outperforms state-of-the-art online methods but also achieves accuracy comparable to high-cost LiDAR-based approaches, thereby offering a scalable and efficient solution for modern autonomous navigation systems.
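The abstract states that the network consumes time-aligned, multi-directional images but does not describe how the alignment is performed. As a minimal sketch of one plausible preprocessing step, and not the authors' method, the Python below groups frames from several cameras by nearest timestamp. The camera names, the millisecond timestamps, the `tolerance_ms` parameter, and the functions `nearest_index` and `align_frames` are all hypothetical and introduced here only for illustration.

```python
from bisect import bisect_left


def nearest_index(timestamps, t):
    """Return the index of the entry in `timestamps` closest to `t`.

    Assumes `timestamps` is non-empty and sorted in ascending order.
    """
    i = bisect_left(timestamps, t)
    if i == 0:
        return 0
    if i == len(timestamps):
        return len(timestamps) - 1
    # Pick whichever neighbour is closer; ties go to the earlier frame.
    return i if timestamps[i] - t < t - timestamps[i - 1] else i - 1


def align_frames(camera_timestamps, reference, tolerance_ms):
    """Group one frame index per camera around each reference-camera frame.

    A group is dropped if any camera has no frame within `tolerance_ms`
    of the reference timestamp (hypothetical policy, for illustration).
    """
    groups = []
    for t in camera_timestamps[reference]:
        group = {}
        for cam, stamps in camera_timestamps.items():
            j = nearest_index(stamps, t)
            if abs(stamps[j] - t) > tolerance_ms:
                group = None
                break
            group[cam] = j
        if group is not None:
            groups.append(group)
    return groups


# Hypothetical timestamps in milliseconds for three roadside cameras.
cameras = {
    "north": [0, 100, 200],
    "south": [10, 120, 350],
    "east": [20, 90, 210],
}
groups = align_frames(cameras, reference="north", tolerance_ms=30)
print(groups)
```

In this example the third reference frame is discarded because the closest "south" frame is 80 ms away, which exceeds the tolerance. Dropping incomplete groups is only one possible policy; a real deployment might instead interpolate or fall back to fewer views.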