CV · RO · Mar 21, 2025

SGFormer: Satellite-Ground Fusion for 3D Semantic Scene Completion

arXiv:2503.16825v2 · 4 citations · h-index: 11 · Has Code · CVPR
Originality: Incremental advance
AI Analysis

This work addresses scene understanding for autonomous driving by improving semantic completion in occluded areas. It represents an incremental advance, achieved through a novel fusion of satellite and ground data.

The paper tackles incomplete scene semantics in camera-based semantic scene completion, a problem caused by visual occlusions. It introduces SGFormer, a satellite-ground cooperative framework that fuses satellite and ground views and achieves state-of-the-art performance on the SemanticKITTI and SSCBench-KITTI-360 datasets.

Recently, camera-based solutions have been extensively explored for semantic scene completion (SSC). Despite their success in visible areas, existing methods struggle to capture complete scene semantics due to frequent visual occlusions. To address this limitation, this paper presents the first satellite-ground cooperative SSC framework, i.e., SGFormer, exploring the potential of satellite-ground image pairs in the SSC task. Specifically, we propose a dual-branch architecture that encodes orthogonal satellite and ground views in parallel, unifying them into a common domain. Additionally, we design a ground-view guidance strategy that corrects satellite image biases during feature encoding, addressing misalignment between satellite and ground views. Moreover, we develop an adaptive weighting strategy that balances contributions from satellite and ground views. Experiments demonstrate that SGFormer outperforms the state of the art on the SemanticKITTI and SSCBench-KITTI-360 datasets. Our code is available at https://github.com/gxytcrc/SGFormer.
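
The adaptive weighting idea in the abstract can be illustrated with a minimal sketch. This is not the SGFormer implementation: the abstract does not specify how the weights are computed, so the sketch assumes a simple per-voxel sigmoid gate over the concatenated features. The names `adaptive_fuse`, `gate_weights`, and `gate_bias` are hypothetical, and the features are assumed to already live in a common voxel domain.

```python
import math
from typing import List, Sequence


def sigmoid(x: float) -> float:
    """Logistic function mapping a real-valued logit to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def adaptive_fuse(
    ground: Sequence[Sequence[float]],
    satellite: Sequence[Sequence[float]],
    gate_weights: Sequence[float],
    gate_bias: float = 0.0,
) -> List[List[float]]:
    """Blend ground and satellite features voxel by voxel.

    Assumption (not from the paper): each voxel gets one scalar weight
    alpha = sigmoid(w . [ground; satellite] + b), and the fused feature
    is alpha * ground + (1 - alpha) * satellite.
    """
    if len(ground) != len(satellite):
        raise ValueError("both views must cover the same voxels")

    fused = []
    for g, s in zip(ground, satellite):
        if len(g) != len(s):
            raise ValueError("feature dimensions must match")
        if len(gate_weights) != len(g) + len(s):
            raise ValueError("gate_weights must match the concatenated size")
        # Gate logit computed from the concatenated per-voxel features.
        concat = list(g) + list(s)
        logit = gate_bias + sum(w * x for w, x in zip(gate_weights, concat))
        alpha = sigmoid(logit)
        # Convex combination: alpha near 1 trusts the ground view more.
        fused.append([alpha * gi + (1.0 - alpha) * si for gi, si in zip(g, s)])
    return fused


# One voxel with 2-channel features; a zero gate weighs both views equally.
ground_feats = [[1.0, 3.0]]
satellite_feats = [[3.0, 5.0]]
balanced = adaptive_fuse(ground_feats, satellite_feats, [0.0] * 4)
```

With a zero gate the two views are simply averaged. In a trained model the gate parameters would be learned, so that, under this assumption, occluded voxels could lean on the satellite view while clearly visible ones lean on the ground view.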

Code Implementations: 1 repo
