CV LG | Apr 22, 2024

OccFeat: Self-supervised Occupancy Feature Prediction for Pretraining BEV Segmentation Networks

Sophia Sirko-Galouchenko, Alexandre Boulch, Spyros Gidaris, Andrei Bursuc, Antonin Vobecky, Patrick Pérez, Renaud Marlet

arXiv:2404.14027v317.322 citations | h-index: 36 | Has Code | 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)

Originality: Incremental advance

AI Analysis

This work addresses the challenge of training BEV segmentation networks with limited labeled data. The advance is incremental, since it builds on existing pretraining and distillation techniques.

The paper tackles the problem of improving Bird's-Eye-View (BEV) semantic segmentation for camera-only systems by introducing OccFeat, a self-supervised pretraining method that combines occupancy prediction and feature distillation, resulting in enhanced performance especially in low-data scenarios.

We introduce a self-supervised pretraining method, called OccFeat, for camera-only Bird's-Eye-View (BEV) segmentation networks. With OccFeat, we pretrain a BEV network via occupancy prediction and feature distillation tasks. Occupancy prediction provides a 3D geometric understanding of the scene to the model. However, the geometry learned is class-agnostic. Hence, we add semantic information to the model in the 3D space through distillation from a self-supervised pretrained image foundation model. Models pretrained with our method exhibit improved BEV semantic segmentation performance, particularly in low-data scenarios. Moreover, empirical results affirm the efficacy of integrating feature distillation with 3D occupancy prediction in our pretraining approach. Repository: https://github.com/valeoai/Occfeat
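The combination of the two pretraining tasks described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the tensor shapes, the choice of binary cross-entropy for occupancy and cosine distance for distillation, the restriction of distillation to occupied voxels, and the `weight` parameter are all assumptions made for illustration. Teacher features are assumed to have already been lifted from the image foundation model into the voxel grid.

```python
import numpy as np


def occupancy_loss(occ_logits, occ_target):
    """Binary cross-entropy between occupancy logits and 0/1 voxel targets.

    Both arrays have shape (X, Y, Z). Uses the numerically stable form
    max(x, 0) - x * y + log(1 + exp(-|x|)).
    """
    x, y = occ_logits, occ_target
    per_voxel = np.maximum(x, 0) - x * y + np.log1p(np.exp(-np.abs(x)))
    return float(np.mean(per_voxel))


def distillation_loss(pred_feats, teacher_feats, occ_target, eps=1e-8):
    """Mean cosine distance between predicted and teacher features.

    Feature arrays have shape (X, Y, Z, C). Only occupied voxels
    contribute (an assumption of this sketch).
    """
    mask = occ_target.astype(bool)
    if not mask.any():
        return 0.0
    p = pred_feats[mask]      # (N, C) features at occupied voxels
    t = teacher_feats[mask]   # (N, C)
    p = p / (np.linalg.norm(p, axis=-1, keepdims=True) + eps)
    t = t / (np.linalg.norm(t, axis=-1, keepdims=True) + eps)
    return float(np.mean(1.0 - np.sum(p * t, axis=-1)))


def pretraining_loss(occ_logits, occ_target, pred_feats, teacher_feats,
                     weight=1.0):
    """Sum of the class-agnostic geometric term and the semantic term.

    `weight` is a hypothetical balancing coefficient, not a value
    taken from the paper.
    """
    geometric = occupancy_loss(occ_logits, occ_target)
    semantic = distillation_loss(pred_feats, teacher_feats, occ_target)
    return geometric + weight * semantic
```

In this sketch the occupancy term supplies the class-agnostic geometric signal, while the distillation term adds semantic information at the voxels known to be occupied, mirroring the motivation given in the abstract.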

