CV | Feb 20, 2025

RendBEV: Semantic Novel View Synthesis for Self-Supervised Bird's Eye View Segmentation

arXiv:2502.14792v2 | 5 citations | h-index: 26 | WACV
Originality: Incremental advance
AI Analysis

RendBEV addresses the need for efficient BEV semantic mapping in assisted and autonomous driving by reducing reliance on costly annotated data. The advance is incremental, since it builds on existing self-supervised and rendering techniques.

The paper tackles the problem of training Bird's Eye View (BEV) semantic segmentation networks without large annotated datasets by introducing RendBEV, a self-supervised method that uses differentiable volumetric rendering and supervision from a 2D semantic segmentation model. Without any BEV labels, it achieves competitive zero-shot BEV segmentation. Used as pretraining, it significantly boosts performance in low-annotation regimes and sets a new state of the art when fine-tuned on all available labels.

Bird's Eye View (BEV) semantic maps have recently garnered a lot of attention as a useful representation of the environment to tackle assisted and autonomous driving tasks. However, most of the existing work focuses on the fully supervised setting, training networks on large annotated datasets. In this work, we present RendBEV, a new method for the self-supervised training of BEV semantic segmentation networks, leveraging differentiable volumetric rendering to receive supervision from semantic perspective views computed by a 2D semantic segmentation model. Our method enables zero-shot BEV semantic segmentation, and already delivers competitive results in this challenging setting. When used as pretraining to then fine-tune on labeled BEV ground-truth, our method significantly boosts performance in low-annotation regimes, and sets a new state of the art when fine-tuning on all available labels.
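
The abstract does not give the rendering equations, so the following is only a minimal sketch of how semantic supervision through differentiable volumetric rendering generally works, assuming the standard emission-absorption compositing used in NeRF-style methods. The function names `render_semantics` and `cross_entropy`, the per-sample densities, and the per-sample class probabilities are all hypothetical; in a method like RendBEV these per-sample values would presumably be derived from the network's BEV prediction, and the label would come from the 2D semantic segmentation model. NumPy is used for clarity; a real training loop would need an autodiff framework.

```python
import numpy as np

def render_semantics(sigma, sem_probs, deltas):
    """Composite per-sample class probabilities along one camera ray.

    sigma:     (N,) non-negative densities at N samples along the ray
    sem_probs: (N, C) class probabilities at each sample
    deltas:    (N,) step lengths associated with each sample
    Returns the (C,) rendered class scores and the (N,) compositing weights.
    """
    # Opacity contributed by each sample.
    alpha = 1.0 - np.exp(-sigma * deltas)
    # Transmittance: fraction of the ray that reaches sample i unoccluded.
    trans = np.cumprod(np.concatenate(([1.0], 1.0 - alpha[:-1])))
    weights = trans * alpha
    return weights @ sem_probs, weights

def cross_entropy(rendered, label, eps=1e-8):
    """Negative log-likelihood of a 2D pseudo-label under the rendered scores."""
    probs = rendered / (rendered.sum() + eps)
    return -np.log(probs[label] + eps)

# Toy ray: empty space, then a nearly opaque sample, then an occluded sample.
sigma = np.array([0.0, 50.0, 1.0])
deltas = np.ones(3)
sem_probs = np.array([[1.0, 0.0],
                      [0.1, 0.9],
                      [1.0, 0.0]])
rendered, weights = render_semantics(sigma, sem_probs, deltas)
loss_match = cross_entropy(rendered, label=1)     # label agrees with the visible sample
loss_mismatch = cross_entropy(rendered, label=0)  # label disagrees
```

In this toy ray the second sample absorbs almost all of the compositing weight, so the rendered scores follow its class distribution and the loss is lower when the 2D label agrees with it. Because every step is differentiable, a loss of this kind can propagate gradients from perspective-view labels back to whatever representation produced the densities and class probabilities.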

