CV · Apr 17, 2025

Rethinking Temporal Fusion with a Unified Gradient Descent View for 3D Semantic Occupancy Prediction

Dubing Chen, Huan Zheng, Jin Fang, Xingping Dong, Xianfei Li, Wenlong Liao, Tao He, Pai Peng, Jianbing Shen

arXiv:2504.12959v2 · 11 citations · h-index: 23 · Has Code · CVPR

Originality: Highly original

AI Analysis

This paper addresses the challenge of effectively integrating temporal information in 3D semantic occupancy prediction for autonomous driving systems, representing a strong incremental advance.

The paper tackles the problem of temporal fusion in vision-based 3D semantic occupancy prediction by proposing GDFusion, which identifies three overlooked temporal cues and introduces a novel fusion strategy based on gradient descent. The method achieves 1.4%-4.8% mIoU improvements and reduces memory consumption by 27%-72% on the Occ3D benchmark.

We present GDFusion, a temporal fusion method for vision-based 3D semantic occupancy prediction (VisionOcc). GDFusion explores underexplored aspects of temporal fusion within the VisionOcc framework, focusing on both temporal cues and fusion strategies. It systematically examines the entire VisionOcc pipeline, identifying three fundamental yet previously overlooked temporal cues: scene-level consistency, motion calibration, and geometric complementation. These cues capture diverse facets of temporal evolution and make distinct contributions across various modules in the VisionOcc framework. To effectively fuse temporal signals across heterogeneous representations, we propose a novel fusion strategy by reinterpreting the formulation of vanilla RNNs. This reinterpretation leverages gradient descent on features to unify the integration of diverse temporal information, seamlessly embedding the proposed temporal cues into the network. Extensive experiments on nuScenes demonstrate that GDFusion significantly outperforms established baselines. Notably, on the Occ3D benchmark, it achieves 1.4%-4.8% mIoU improvements and reduces memory consumption by 27%-72%.
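
The idea of treating a recurrent update as gradient descent on features can be illustrated with a toy example. The sketch below is not the paper's formulation: it assumes a simple quadratic loss L(h) = 0.5 * ||h - x_t||^2 between the running feature h and the current-frame feature x_t, and the names `gd_fuse`, `fuse_sequence`, and `lr` are hypothetical. Under that assumption, one gradient step reduces to the linear recurrence h_t = (1 - lr) * h_{t-1} + lr * x_t, which shows how a recurrent fusion rule can fall out of a descent step.

```python
def gd_fuse(h_prev, x_t, lr=0.5):
    """One gradient-descent step on the assumed loss
    L(h) = 0.5 * ||h - x_t||^2, starting from h_prev."""
    # Gradient of the assumed quadratic loss, evaluated at h_prev.
    grad = [h - x for h, x in zip(h_prev, x_t)]
    # Descent step; equivalent to (1 - lr) * h_prev + lr * x_t.
    return [h - lr * g for h, g in zip(h_prev, grad)]


def fuse_sequence(frames, lr=0.5):
    """Fuse a list of per-frame feature vectors, oldest first."""
    h = list(frames[0])  # initialise the history with the first frame
    for x_t in frames[1:]:
        h = gd_fuse(h, x_t, lr)
    return h
```

With `lr=1.0` the history is discarded and the fused feature equals the current frame; smaller values retain more of the past. A different choice of loss would yield a different update rule.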
