CV · Sep 16, 2024

Robust Bird's Eye View Segmentation by Adapting DINOv2

arXiv:2409.10228v1 · 8 citations · h-index: 6
Originality: Incremental advance
AI Analysis

This work addresses robustness issues in BEV perception for autonomous driving, but it is incremental as it builds on existing frameworks like SimpleBEV.

The paper tackles the performance degradation of Bird's Eye View (BEV) segmentation for autonomous driving under corruptions such as brightness changes and camera failures. It adapts DINOv2 to the BEV task with LoRA, which increases robustness, with further gains from scaling up model size and input resolution.

Extracting a Bird's Eye View (BEV) representation from multiple camera images offers a cost-effective, scalable alternative to LIDAR-based solutions in autonomous driving. However, the performance of the existing BEV methods drops significantly under various corruptions such as brightness and weather changes or camera failures. To improve the robustness of BEV perception, we propose to adapt a large vision foundational model, DINOv2, to BEV estimation using Low Rank Adaptation (LoRA). Our approach builds on the strong representation space of DINOv2 by adapting it to the BEV task in a state-of-the-art framework, SimpleBEV. Our experiments show increased robustness of BEV perception under various corruptions, with increasing gains from scaling up the model and the input resolution. We also showcase the effectiveness of the adapted representations in terms of fewer learnable parameters and faster convergence during training.
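
To make the LoRA idea in the abstract concrete, here is a minimal sketch in plain Python of a linear layer whose pretrained weight stays frozen while a low-rank update is learned on top of it. This is not the authors' implementation: the class `LoRALinear`, the helper `lora_param_counts`, the rank of 8, the scaling factor `alpha`, and the toy initialization are all illustrative assumptions. The width 768 is the embedding dimension of a ViT-B backbone and is used only to show the order of magnitude of the parameter savings.

```python
# Minimal LoRA sketch in plain Python (no deep-learning framework).
# All names, ranks, and sizes are illustrative assumptions, not the paper's code.

def matvec(M, x):
    """Multiply matrix M (a list of rows) by vector x."""
    return [sum(m_ij * x_j for m_ij, x_j in zip(row, x)) for row in M]


class LoRALinear:
    """Frozen weight W plus a trainable low-rank update (alpha / rank) * B @ A."""

    def __init__(self, W, rank, alpha=1.0):
        self.W = W  # frozen pretrained weight, shape (d_out, d_in)
        d_out, d_in = len(W), len(W[0])
        self.rank = rank
        self.scale = alpha / rank
        # A gets small non-zero values (a deterministic stand-in for random init)
        # and B starts at zero, so the adapted layer initially matches the
        # pretrained layer exactly.
        self.A = [[0.01 * ((i + j) % 3 - 1) for j in range(d_in)] for i in range(rank)]
        self.B = [[0.0] * rank for _ in range(d_out)]

    def forward(self, x):
        base = matvec(self.W, x)                    # frozen path: W x
        update = matvec(self.B, matvec(self.A, x))  # low-rank path: B (A x)
        return [b + self.scale * u for b, u in zip(base, update)]

    def trainable_params(self):
        """Only A and B are trained; W is excluded."""
        d_out, d_in = len(self.W), len(self.W[0])
        return self.rank * (d_in + d_out)


def lora_param_counts(d_in, d_out, rank):
    """Return (full fine-tuning params, LoRA params) for one weight matrix."""
    return d_in * d_out, rank * (d_in + d_out)


if __name__ == "__main__":
    layer = LoRALinear([[1.0, 2.0], [3.0, 4.0]], rank=1)
    print(layer.forward([1.0, 1.0]))      # same as the frozen layer at init
    print(lora_param_counts(768, 768, 8))
```

Under these assumptions, a single 768 x 768 projection has 589,824 weights when fully fine-tuned, while a rank-8 LoRA update trains only 12,288, which illustrates why the adapted model needs far fewer learnable parameters.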

