CV | Nov 12, 2024

ALOcc: Adaptive Lifting-Based 3D Semantic Occupancy and Cost Volume-Based Flow Predictions

Dubing Chen, Jin Fang, Wencheng Han, Xinjing Cheng, Junbo Yin, Chenzhong Xu, Fahad Shahbaz Khan, Jianbing Shen

arXiv:2411.07725v2 | 15.815 citations | h-index: 17 | Has Code

Originality: Incremental advance

AI Analysis

This work addresses spatiotemporal scene understanding for autonomous systems, presenting incremental improvements to existing frameworks.

The paper tackles 3D semantic occupancy and flow prediction by proposing a vision-based framework with three improvements: an occlusion-aware adaptive lifting mechanism with depth denoising, 3D-2D semantic consistency enforcement via optimized prototypes, and a BEV-centric cost volume for joint prediction. The method achieves new state-of-the-art performance on multiple benchmarks and offers a real-time version that exceeds existing real-time methods in speed and accuracy.

3D semantic occupancy and flow prediction are fundamental to spatiotemporal scene understanding. This paper proposes a vision-based framework with three targeted improvements. First, we introduce an occlusion-aware adaptive lifting mechanism incorporating depth denoising. This enhances the robustness of 2D-to-3D feature transformation while mitigating reliance on depth priors. Second, we enforce 3D-2D semantic consistency via jointly optimized prototypes, using confidence- and category-aware sampling to address the long-tail classes problem. Third, to streamline joint prediction, we devise a BEV-centric cost volume to explicitly correlate semantic and flow features, supervised by a hybrid classification-regression scheme that handles diverse motion scales. Our purely convolutional architecture establishes new SOTA performance on multiple benchmarks for both semantic occupancy and joint occupancy semantic-flow prediction. We also present a family of models offering a spectrum of efficiency-performance trade-offs. Our real-time version exceeds all existing real-time methods in speed and accuracy, ensuring its practical viability.
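To make the cost-volume idea in the abstract concrete, here is a minimal NumPy sketch. It is not the paper's implementation. It assumes two BEV feature maps of shape (C, H, W) from consecutive frames, builds a cosine-similarity cost volume over a small displacement window, and decodes a per-cell displacement as a softmax-weighted average of the candidate offsets. That decoding is a simple stand-in for combining classification over bins with a continuous estimate, not the hybrid scheme the authors describe. The names `bev_cost_volume` and `decode_flow`, the window size, and the temperature are all hypothetical.

```python
import numpy as np


def bev_cost_volume(feat_cur, feat_prev, max_disp=1):
    """Cosine-similarity cost volume over a (2*max_disp+1)^2 window.

    feat_cur, feat_prev: arrays of shape (C, H, W).
    Returns (volume, offsets): volume has shape (K, H, W), and
    offsets has shape (K, 2) holding the (dy, dx) of each channel.
    volume[k, y, x] compares feat_cur at (y, x) with feat_prev at
    (y + dy, x + dx); out-of-range cells are zero-padded.
    """
    eps = 1e-8
    # L2-normalise along the channel axis so dot products are cosines.
    cur = feat_cur / (np.linalg.norm(feat_cur, axis=0, keepdims=True) + eps)
    prev = feat_prev / (np.linalg.norm(feat_prev, axis=0, keepdims=True) + eps)
    _, height, width = cur.shape
    d = max_disp
    padded = np.pad(prev, ((0, 0), (d, d), (d, d)))
    slices, offsets = [], []
    for dy in range(-d, d + 1):
        for dx in range(-d, d + 1):
            shifted = padded[:, d + dy:d + dy + height, d + dx:d + dx + width]
            slices.append((cur * shifted).sum(axis=0))
            offsets.append((dy, dx))
    return np.stack(slices), np.array(offsets, dtype=np.float64)


def decode_flow(volume, offsets, temperature=0.05):
    """Softmax over candidate offsets, then take the expected offset.

    Returns an array of shape (2, H, W) holding (dy, dx) per BEV cell.
    """
    logits = volume / temperature
    logits = logits - logits.max(axis=0, keepdims=True)  # numerical stability
    prob = np.exp(logits)
    prob = prob / prob.sum(axis=0, keepdims=True)
    # Contract the K axis: (2, K) x (K, H, W) -> (2, H, W).
    return np.tensordot(offsets.T, prob, axes=1)
```

Calling `bev_cost_volume(cur, prev)` followed by `decode_flow(volume, offsets)` yields a dense displacement field in grid cells. A real system would learn the features and convert cell offsets to metric velocities.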
