CVApr 23, 2025

Rethinking Generalizable Infrared Small Target Detection: A Real-scene Benchmark and Cross-view Representation Learning

arXiv:2504.16487v12 citationsh-index: 20Has CodeIEEE Trans Geosci Remote Sens
Originality Highly original
AI Analysis

This work addresses generalization challenges in infrared small target detection for surveillance and defense applications, representing an incremental improvement with novel methods for a known bottleneck.

The paper tackles the problem of domain shift in infrared small target detection by introducing a domain adaptation framework with cross-view alignment and noise-resistant learning, achieving superior performance in detection probability, false alarm rate, and IoU compared to state-of-the-art methods.

Infrared small target detection (ISTD) is highly sensitive to sensor type, observation conditions, and the intrinsic properties of the target. These factors can introduce substantial variations in the distribution of acquired infrared image data, a phenomenon known as domain shift. Such distribution discrepancies significantly hinder the generalization capability of ISTD models across diverse scenarios. To tackle this challenge, this paper introduces an ISTD framework enhanced by domain adaptation. To alleviate distribution shift between datasets and achieve cross-sample alignment, we introduce Cross-view Channel Alignment (CCA). Additionally, we propose the Cross-view Top-K Fusion strategy, which integrates target information with diverse background features, enhancing the model' s ability to extract critical data characteristics. To further mitigate the impact of noise on ISTD, we develop a Noise-guided Representation learning strategy. This approach enables the model to learn more noise-resistant feature representations, to improve its generalization capability across diverse noisy domains. Finally, we develop a dedicated infrared small target dataset, RealScene-ISTD. Compared to state-of-the-art methods, our approach demonstrates superior performance in terms of detection probability (Pd), false alarm rate (Fa), and intersection over union (IoU). The code is available at: https://github.com/luy0222/RealScene-ISTD.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes