CVApr 2

Rethinking Representations for Cross-Domain Infrared Small Target Detection: A Generalizable Perspective from the Frequency Domain

Yimin Fu, Songbo Wang, Feiyan Wu, Jialin Lyu, Zhunga Liu, Michael K. Ng

arXiv:2604.0193426.9h-index: 3

AI Analysis

This addresses the challenge of domain shifts in infrared small target detection for practical applications like surveillance, though it is incremental as it builds on existing methods with novel components.

The paper tackles the problem of cross-domain infrared small target detection by proposing a spatial-spectral collaborative perception network (S²CPNet) that rethinks representations from a frequency perspective, achieving state-of-the-art performance across three datasets under diverse cross-domain settings.

The accurate target-background separation in infrared small target detection (IRSTD) highly depends on the discriminability of extracted representations. However, most existing methods are confined to domain-consistent settings, while overlooking whether such discriminability can generalize to unseen domains. In practice, distribution shifts between training and testing data are inevitable due to variations in observational conditions and environmental factors. Meanwhile, the intrinsic indistinctiveness of infrared small targets aggravates overfitting to domain-specific patterns. Consequently, the detection performance of models trained on source domains can be severely degraded when deployed in unseen domains. To address this challenge, we propose a spatial-spectral collaborative perception network (S$^2$CPNet) for cross-domain IRSTD. Moving beyond conventional spatial learning pipelines, we rethink IRSTD representations from a frequency perspective and reveal inconsistencies in spectral phase as the primary manifestation of domain discrepancies. Based on this insight, we develop a phase rectification module (PRM) to derive generalizable target awareness. Then, we employ an orthogonal attention mechanism (OAM) in skip connections to preserve positional information while refining informative representations. Moreover, the bias toward domain-specific patterns is further mitigated through selective style recomposition (SSR). Extensive experiments have been conducted on three IRSTD datasets, and the proposed method consistently achieves state-of-the-art performance under diverse cross-domain settings.

View on arXiv PDF

Similar