CV | AI | Jun 15, 2023

UniOcc: Unifying Vision-Centric 3D Occupancy Prediction with Geometric and Semantic Rendering

arXiv:2306.09117v1 | 36 citations | h-index: 38
Originality: Incremental advance
AI Analysis

This work addresses the high cost and resolution limitations of 3D occupancy annotation for autonomous driving applications, offering an incremental improvement over existing methods.

The paper tackles the problem of 3D occupancy prediction from vision-centric data by proposing UniOcc, which unifies geometric and semantic rendering to address limitations in existing methods that rely on expensive and coarse 3D labels. The result is a 51.27% mIoU on the nuScenes dataset, ranking 3rd in a CVPR 2023 challenge.
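The headline number is a mean Intersection-over-Union (mIoU) score. As a rough illustration of how such a metric is generally computed over voxel labels, here is a minimal sketch. The function name `mean_iou`, the `ignore_index` parameter, and the toy labels are assumptions made for this example; the challenge's official evaluation protocol (class list, masking rules) is not described in this summary and may differ.

```python
def mean_iou(pred, target, num_classes, ignore_index=None):
    """Generic mIoU over flat lists of integer class labels (one per voxel).

    Hypothetical helper for illustration only; not the official challenge
    evaluation code.
    """
    intersection = [0] * num_classes
    union = [0] * num_classes
    for p, t in zip(pred, target):
        if t == ignore_index:
            continue  # skip voxels that carry no ground-truth label
        if p == t:
            intersection[t] += 1
            union[t] += 1
        else:
            # A mismatch enlarges the union of both classes involved.
            union[p] += 1
            union[t] += 1
    # Average only over classes that appear in prediction or ground truth.
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    return sum(ious) / len(ious) if ious else float("nan")


# Toy example with three classes and six voxels.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
score = mean_iou(pred, target, num_classes=3)
print(round(score, 4))  # per-class IoUs are 1/3, 2/3, and 1/2
```

Averaging per-class IoU rather than pooling all voxels keeps rare classes from being swamped by frequent ones, which is the usual reason this metric is preferred for semantic occupancy.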

In this technical report, we present our solution, named UniOcc, for the Vision-Centric 3D occupancy prediction track in the nuScenes Open Dataset Challenge at CVPR 2023. Existing methods for occupancy prediction primarily focus on optimizing projected features in 3D volume space using 3D occupancy labels. However, the generation process of these labels is complex and expensive (relying on 3D semantic annotations), and, limited by voxel resolution, they cannot provide fine-grained spatial semantics. To address this limitation, we propose a novel Unifying Occupancy (UniOcc) prediction method, explicitly imposing spatial geometry constraints and complementing fine-grained semantic supervision through volume ray rendering. Our method significantly enhances model performance and demonstrates promising potential in reducing human annotation costs. Given the laborious nature of annotating 3D occupancy, we further introduce a Depth-aware Teacher-Student (DTS) framework to enhance prediction accuracy using unlabeled data. Our solution achieves 51.27% mIoU on the official leaderboard with a single model, placing 3rd in this challenge.
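To make the idea of supervising through volume ray rendering more concrete, the following is a minimal sketch of standard NeRF-style compositing along a single camera ray. It is not taken from the report: the function name `render_ray`, the uniform sample spacing, and the toy densities are assumptions, and the authors' actual rendering and loss formulation may differ.

```python
import math


def render_ray(densities, deltas, sample_depths, sample_semantics):
    """Composite per-sample values along one ray.

    Hypothetical helper for illustration. Each sample i has a density
    sigma_i, a segment length delta_i, a depth t_i, and a per-class score
    vector s_i. The weights follow the usual volume-rendering quadrature:
    w_i = T_i * (1 - exp(-sigma_i * delta_i)), with T_i the transmittance
    accumulated before sample i.
    """
    transmittance = 1.0
    weights = []
    for sigma, delta in zip(densities, deltas):
        alpha = 1.0 - math.exp(-sigma * delta)  # opacity of this segment
        weights.append(transmittance * alpha)
        transmittance *= 1.0 - alpha  # light remaining after the segment
    # Expected depth and expected class scores under the ray weights.
    depth = sum(w * t for w, t in zip(weights, sample_depths))
    num_classes = len(sample_semantics[0])
    semantics = [
        sum(w * s[c] for w, s in zip(weights, sample_semantics))
        for c in range(num_classes)
    ]
    return weights, depth, semantics


# Toy ray: empty space for two samples, then a dense surface of class 1.
densities = [0.0, 0.0, 50.0, 50.0]
deltas = [1.0, 1.0, 1.0, 1.0]
sample_depths = [0.5, 1.5, 2.5, 3.5]
sample_semantics = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
weights, depth, semantics = render_ray(
    densities, deltas, sample_depths, sample_semantics
)
print(depth, semantics)
```

In a training setup of this kind, the rendered depth and class scores would be compared against 2D depth and semantic labels for the corresponding pixel, so the 3D volume receives supervision without dense 3D annotation. The specific loss terms are not given in this summary.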
