CV | Oct 21, 2024

Focus on BEV: Self-calibrated Cycle View Transformation for Monocular Birds-Eye-View Segmentation

arXiv:2410.15932v1 | 2 citations | h-index: 5
Originality: Incremental advance
AI Analysis

This work improves BEV segmentation for autonomous driving by enhancing the view transformation step, though the advance appears incremental and relies on a hybrid of existing methods.

The paper tackles the problem of monocular birds-eye-view segmentation by addressing disruptions from BEV-agnostic features, proposing a FocusBEV framework with self-calibrated view transformation and temporal fusion. It achieves state-of-the-art results with 29.2% mIoU on nuScenes and 35.2% mIoU on Argoverse benchmarks.
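Since both headline numbers are mIoU scores, a minimal sketch of how mean IoU is typically computed over per-class binary BEV masks may help. The class names, grid sizes, and the `iou` / `mean_iou` helpers are illustrative assumptions, not the paper's evaluation code; benchmark protocols often also apply visibility masks, which this sketch omits.

```python
def iou(pred, target):
    """IoU between two binary BEV masks given as nested lists of 0/1."""
    inter = union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            inter += 1 if (p and t) else 0
            union += 1 if (p or t) else 0
    # None marks a class that is absent from both prediction and target.
    return inter / union if union else None


def mean_iou(preds, targets):
    """Average IoU over classes, skipping classes absent from both masks."""
    scores = [iou(preds[name], targets[name]) for name in preds]
    scores = [s for s in scores if s is not None]
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical 2x3 BEV grids for two illustrative classes.
preds = {
    "drivable": [[1, 1, 0], [1, 0, 0]],
    "vehicle": [[0, 0, 1], [0, 0, 1]],
}
targets = {
    "drivable": [[1, 1, 0], [1, 1, 0]],
    "vehicle": [[0, 0, 0], [0, 0, 1]],
}
print(mean_iou(preds, targets))  # (3/4 + 1/2) / 2 = 0.625
```

Averaging per class rather than per cell keeps small classes such as vehicles from being swamped by large ones such as drivable area.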

Birds-Eye-View (BEV) segmentation aims to establish a spatial mapping from the perspective view to the top view and estimate the semantic maps from monocular images. Recent studies have encountered difficulties in view transformation due to the disruption of BEV-agnostic features in image space. To tackle this issue, we propose a novel FocusBEV framework consisting of (i) a self-calibrated cross view transformation module to suppress the BEV-agnostic image areas and focus on the BEV-relevant areas in the view transformation stage, (ii) a plug-and-play ego-motion-based temporal fusion module to exploit the spatiotemporal structure consistency in BEV space with a memory bank, and (iii) an occupancy-agnostic IoU loss to mitigate both semantic and positional uncertainties. Experimental evidence demonstrates that our approach achieves new state-of-the-art results on two popular benchmarks, i.e., 29.2% mIoU on nuScenes and 35.2% mIoU on Argoverse.
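Ego-motion-based temporal fusion generally requires past BEV content to be re-expressed in the current ego frame before it can be combined with the current frame. The sketch below shows only that geometric alignment step as a planar rigid transform. The function name `warp_to_current`, the `(dx, dy, yaw)` pose convention, and the point-list representation are assumptions for illustration; the abstract does not specify how FocusBEV implements this module or its memory bank.

```python
import math


def warp_to_current(points_prev, dx, dy, yaw):
    """Map (x, y) points from the previous ego frame into the current one.

    Assumed convention: (dx, dy, yaw) is the pose of the current ego frame
    expressed in the previous ego frame (translation in metres, yaw in radians).
    """
    c, s = math.cos(yaw), math.sin(yaw)
    warped = []
    for x, y in points_prev:
        # Undo the translation, then rotate by -yaw (inverse rigid transform).
        tx, ty = x - dx, y - dy
        warped.append((c * tx + s * ty, -s * tx + c * ty))
    return warped


# The ego vehicle drove 2 m straight ahead between frames, so a point that
# was 5 m ahead in the previous frame is now 3 m ahead.
print(warp_to_current([(5.0, 1.0)], dx=2.0, dy=0.0, yaw=0.0))  # [(3.0, 1.0)]
```

Once past observations are aligned this way, features stored for earlier frames can be compared or merged cell by cell with the current BEV grid.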
