CV · Jul 10, 2024

IRSAM: Advancing Segment Anything Model for Infrared Small Target Detection

arXiv:2407.07520v1 · 163 citations · h-index: 21 · Has Code
Originality: Incremental advance
AI Analysis

This work addresses infrared small target detection, a domain-specific problem in computer vision, with incremental improvements to an existing model.

The authors adapt the Segment Anything Model (SAM) to infrared small target detection by addressing the domain gap between natural and infrared images. The resulting IRSAM model outperforms state-of-the-art methods on public datasets including NUAA-SIRST, NUDT-SIRST, and IRSTD-1K.

The recent Segment Anything Model (SAM) is a significant advancement in natural image segmentation, exhibiting potent zero-shot performance suitable for various downstream image segmentation tasks. However, directly utilizing the pretrained SAM for the Infrared Small Target Detection (IRSTD) task falls short of satisfactory performance due to a notable domain gap between natural and infrared images. Unlike a visible light camera, a thermal imager reveals an object's temperature distribution by capturing infrared radiation. Small targets often show a subtle temperature transition at the object's boundaries. To address this issue, we propose the IRSAM model for IRSTD, which improves SAM's encoder-decoder architecture to learn better feature representations of infrared small objects. Specifically, we design a Perona-Malik diffusion (PMD)-based block and incorporate it into multiple levels of SAM's encoder to help it capture essential structural features while suppressing noise. Additionally, we devise a Granularity-Aware Decoder (GAD) to fuse the multi-granularity features from the encoder to capture structural information that may be lost in long-distance modeling. Extensive experiments on the public datasets, including NUAA-SIRST, NUDT-SIRST, and IRSTD-1K, validate the design choice of IRSAM and its significant superiority over representative state-of-the-art methods. The source code is available at: github.com/IPIC-Lab/IRSAM.
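
For readers unfamiliar with the Perona-Malik diffusion mentioned in the abstract, the following is a minimal sketch of the classical image-space scheme, not the authors' PMD block (which operates inside SAM's encoder and is not described in detail here). It assumes a 2-D grayscale array, a 4-neighbour stencil, and the exponential edge-stopping function; the function names and the parameters `kappa`, `dt`, and `n_iter` are illustrative choices, not taken from the paper.

```python
import numpy as np


def perona_malik_step(u, kappa=0.1, dt=0.2):
    """One explicit step of classical Perona-Malik diffusion on a 2-D array.

    kappa: assumed contrast threshold; differences much larger than kappa
           are treated as edges and barely diffused.
    dt:    step size; the 4-neighbour explicit scheme is stable for dt <= 0.25.
    """
    u = np.asarray(u, dtype=np.float64)
    # Replicate the border so differences across the boundary are zero
    # (zero-flux boundary condition).
    p = np.pad(u, 1, mode="edge")

    # Differences between each pixel and its four neighbours.
    d_north = p[:-2, 1:-1] - u
    d_south = p[2:, 1:-1] - u
    d_west = p[1:-1, :-2] - u
    d_east = p[1:-1, 2:] - u

    def g(d):
        # Edge-stopping function: close to 1 in flat regions, close to 0 at edges.
        return np.exp(-((d / kappa) ** 2))

    flux = (g(d_north) * d_north + g(d_south) * d_south
            + g(d_west) * d_west + g(d_east) * d_east)
    return u + dt * flux


def perona_malik(u, n_iter=10, kappa=0.1, dt=0.2):
    """Apply n_iter diffusion steps and return the smoothed array."""
    out = np.asarray(u, dtype=np.float64)
    for _ in range(n_iter):
        out = perona_malik_step(out, kappa=kappa, dt=dt)
    return out
```

The intuition relevant to IRSTD is that low-contrast noise is smoothed while strong intensity transitions are left almost untouched, so structural boundaries survive the filtering.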

Code Implementations: 1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
