CV | Mar 26

Denoise and Align: Towards Source-Free UDA for Robust Panoramic Semantic Segmentation

Yaowen Chang, Zhen Cao, Xu Zheng, Xiaoxin Mi, Zhen Dong

arXiv:2603.25131 | 46.8 | h-index: 5

AI Analysis

This work addresses the challenge of adapting segmentation models to panoramic data without access to source data, for applications such as autonomous driving and virtual reality. It represents a strong, specific gain over prior results.

The paper tackles source-free unsupervised domain adaptation for panoramic semantic segmentation, a task hampered by geometric distortions and high annotation costs. It proposes the DAPASS framework, which achieves state-of-the-art performance of 55.04% mIoU on the outdoor benchmark and 70.38% mIoU on the indoor benchmark.

Panoramic semantic segmentation is pivotal for comprehensive 360° scene understanding in critical applications like autonomous driving and virtual reality. However, progress in this domain is constrained by two key challenges: the severe geometric distortions inherent in panoramic projections and the prohibitive cost of dense annotation. While Unsupervised Domain Adaptation (UDA) from label-rich pinhole-camera datasets offers a viable alternative, many real-world tasks impose a stricter source-free (SFUDA) constraint where source data is inaccessible for privacy or proprietary reasons. This constraint significantly amplifies the core problems of domain shift, leading to unreliable pseudo-labels and dramatic performance degradation, particularly for minority classes. To overcome these limitations, we propose the DAPASS framework. DAPASS introduces two synergistic modules to robustly transfer knowledge without source data. First, our Panoramic Confidence-Guided Denoising (PCGD) module generates high-fidelity, class-balanced pseudo-labels by enforcing perturbation consistency and incorporating neighborhood-level confidence to filter noise. Second, a Contextual Resolution Adversarial Module (CRAM) explicitly addresses scale variance and distortion by adversarially aligning fine-grained details from high-resolution crops with global semantics from low-resolution contexts. DAPASS achieves state-of-the-art performances on outdoor (Cityscapes-to-DensePASS) and indoor (Stanford2D3D) benchmarks, yielding 55.04% (+2.05%) and 70.38% (+1.54%) mIoU, respectively.
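
The pseudo-label filtering idea described for PCGD (perturbation consistency plus neighborhood-level confidence) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the mean-over-window confidence, the threshold of 0.7, the window radius, and the ignore value 255 are all assumptions chosen for illustration, and the class-balancing step mentioned in the abstract is omitted. The inputs are per-pixel class probabilities (H x W x C nested lists) from two perturbed views of the same panorama.

```python
# Illustrative sketch only; all names and defaults below are hypothetical,
# not taken from the DAPASS paper.
IGNORE = 255  # assumed "ignore" label for pixels excluded from training


def argmax_conf(probs):
    """Return (label map, confidence map) from an H x W x C probability grid."""
    labels = [[max(range(len(p)), key=p.__getitem__) for p in row] for row in probs]
    conf = [[max(p) for p in row] for row in probs]
    return labels, conf


def neighborhood_mean(conf, radius=1):
    """Mean confidence over a (2r+1) x (2r+1) window, clipped at the borders."""
    h, w = len(conf), len(conf[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            vals = [conf[j][i]
                    for j in range(max(0, y - radius), min(h, y + radius + 1))
                    for i in range(max(0, x - radius), min(w, x + radius + 1))]
            out[y][x] = sum(vals) / len(vals)
    return out


def filter_pseudo_labels(probs_a, probs_b, conf_thresh=0.7, radius=1):
    """Keep a pixel's label only if both views agree on the class and the
    local mean confidence of view A reaches conf_thresh; else mark IGNORE."""
    labels_a, conf_a = argmax_conf(probs_a)
    labels_b, _ = argmax_conf(probs_b)
    local = neighborhood_mean(conf_a, radius)
    h, w = len(labels_a), len(labels_a[0])
    return [[labels_a[y][x]
             if labels_a[y][x] == labels_b[y][x] and local[y][x] >= conf_thresh
             else IGNORE
             for x in range(w)]
            for y in range(h)]


# Toy 2 x 2 example with two classes; the views disagree at pixel (0, 1).
view_a = [[[0.9, 0.1], [0.8, 0.2]], [[0.9, 0.1], [0.2, 0.8]]]
view_b = [[[0.8, 0.2], [0.3, 0.7]], [[0.9, 0.1], [0.1, 0.9]]]
print(filter_pseudo_labels(view_a, view_b))  # [[0, 255], [0, 1]]
```

In this toy case only the pixel where the two views disagree is discarded. Raising the assumed threshold above the local mean confidence (0.85 here) would discard every pixel, which shows the trade-off between pseudo-label purity and coverage that a denoising step of this kind has to manage.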
