CV · AI · RO · Jun 23, 2025

SWA-SOP: Spatially-aware Window Attention for Semantic Occupancy Prediction in Autonomous Driving

arXiv:2506.18785v2 · 3 citations · h-index: 5 · SMC
Originality: Incremental advance
AI Analysis

This work addresses the challenge of accurate semantic occupancy prediction for autonomous driving systems, representing an incremental improvement over existing transformer-based methods.

Occlusions and data sparsity leave autonomous-driving perception with an incomplete view of the 3D environment. The paper proposes Spatially-aware Window Attention (SWA) for Semantic Occupancy Prediction, which achieves state-of-the-art results on LiDAR-based benchmarks and shows consistent gains when integrated into a camera-based pipeline.

Perception systems in autonomous driving rely on sensors such as LiDAR and cameras to perceive the 3D environment. However, due to occlusions and data sparsity, these sensors often fail to capture complete information. Semantic Occupancy Prediction (SOP) addresses this challenge by inferring both occupancy and semantics of unobserved regions. Existing transformer-based SOP methods lack explicit modeling of spatial structure in attention computation, resulting in limited geometric awareness and poor performance in sparse or occluded areas. To this end, we propose Spatially-aware Window Attention (SWA), a novel mechanism that incorporates local spatial context into attention. SWA significantly improves scene completion and achieves state-of-the-art results on LiDAR-based SOP benchmarks. We further validate its generality by integrating SWA into a camera-based SOP pipeline, where it also yields consistent gains across modalities.
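The abstract does not specify how SWA injects local spatial context into attention, so the following is only a generic illustration of the broad idea, not the authors' method. It is a minimal NumPy sketch of self-attention restricted to non-overlapping windows of a 2D feature grid, with a distance-based bias added to the attention logits. The function name `window_attention`, the parameters `window` and `dist_scale`, the distance bias itself, and the use of identity Q/K/V projections are all assumptions made for this example.

```python
import numpy as np

def window_attention(x, window=4, dist_scale=0.5):
    """Self-attention inside non-overlapping windows of a 2D grid.

    x: float array of shape (H, W, C); H and W must be divisible by `window`.
    A distance-based bias (a hypothetical stand-in for "spatial awareness",
    NOT taken from the paper) is added to the logits so that nearby cells
    attend to each other more strongly.
    Returns (out, attn) with shapes (H, W, C) and (num_windows, w*w, w*w).
    """
    H, W, C = x.shape
    assert H % window == 0 and W % window == 0, "grid must tile evenly"

    # Partition the grid into windows: (num_windows, window*window, C).
    xw = x.reshape(H // window, window, W // window, window, C)
    xw = xw.transpose(0, 2, 1, 3, 4).reshape(-1, window * window, C)

    # Pairwise Euclidean distance between cell coordinates within one window.
    ii, jj = np.meshgrid(np.arange(window), np.arange(window), indexing="ij")
    coords = np.stack([ii.ravel(), jj.ravel()], axis=-1).astype(np.float64)
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    bias = -dist_scale * dist  # (w*w, w*w), zero on the diagonal

    # Identity projections keep the sketch short; a real layer would use
    # learned query, key, and value projections.
    logits = xw @ xw.transpose(0, 2, 1) / np.sqrt(C) + bias
    logits = logits - logits.max(axis=-1, keepdims=True)  # numerical stability
    attn = np.exp(logits)
    attn = attn / attn.sum(axis=-1, keepdims=True)
    out = attn @ xw  # (num_windows, w*w, C)

    # Undo the window partition to recover the (H, W, C) layout.
    out = out.reshape(H // window, W // window, window, window, C)
    out = out.transpose(0, 2, 1, 3, 4).reshape(H, W, C)
    return out, attn
```

Because attention is computed per window, a change to one cell can only affect outputs inside the same window. That locality is the property window-based schemes trade against global context; how SWA itself balances the two is described in the paper, not here.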

