CV · Feb 8, 2024

SpirDet: Towards Efficient, Accurate and Lightweight Infrared Small Target Detector

arXiv:2402.05410v1 · 14 citations · h-index: 5 · IEEE Trans Geosci Remote Sens
Originality: Incremental advance
AI Analysis

This paper addresses computational inefficiency in infrared small target detection, with applications such as surveillance. It is an incremental improvement over existing deep learning methods.

The paper tackles inefficient computation in infrared small target detection by proposing SpirDet, which combines a dual-branch sparse decoder with a lightweight encoder to reduce redundancy. On the IRSTD-1K dataset, the reported result is a 4.7-point improvement in MIoU and a 7× speedup in FPS over the previous state-of-the-art model.

In recent years, the detection of infrared small targets using deep learning methods has garnered substantial attention due to notable advancements. To improve the detection capability of small targets, these methods commonly maintain a pathway that preserves high-resolution features of sparse and tiny targets. However, this can result in redundant and expensive computations. To tackle this challenge, we propose SpirDet, a novel approach for efficient detection of infrared small targets. Specifically, to cope with the computational redundancy issue, we employ a new dual-branch sparse decoder to restore the feature map. First, the fast branch directly predicts a sparse map indicating potential small target locations (occupying only 0.5% of the map area). Second, the slow branch conducts fine-grained adjustments at the positions indicated by the sparse map. Additionally, we design a lightweight DO-RepEncoder based on reparameterization with Downsampling Orthogonality, which can effectively reduce memory consumption and inference latency. Extensive experiments show that the proposed SpirDet significantly outperforms state-of-the-art models while achieving faster inference speed and fewer parameters. For example, on the IRSTD-1K dataset, SpirDet improves $MIoU$ by 4.7 and has a $7\times$ $FPS$ acceleration compared to the previous state-of-the-art model. The code will be open to the public.
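The two-stage idea in the abstract (a fast branch that proposes a sparse set of locations, then a slow branch that refines only those locations) can be illustrated with a toy sketch in plain Python. This is not the SpirDet implementation: the function names, the top-k selection rule, and the local-contrast "refinement" are all assumptions chosen for illustration, standing in for the learned network branches. Only the 0.5% keep ratio comes from the abstract.

```python
# Illustrative sketch only. All names here are hypothetical and the
# operators are hand-written stand-ins for learned network layers.

def fast_branch(coarse_scores, keep_ratio=0.005):
    """Select the highest-scoring cells as candidate target locations.

    coarse_scores: 2D list of floats (a cheap score map).
    keep_ratio: fraction of cells to keep (0.5% as quoted in the abstract).
    Returns a set of (row, col) positions, i.e. the sparse map.
    Top-k selection is an assumption; the abstract does not say how
    the sparse map is derived from the fast-branch prediction.
    """
    h, w = len(coarse_scores), len(coarse_scores[0])
    k = max(1, round(keep_ratio * h * w))
    cells = [(coarse_scores[r][c], r, c) for r in range(h) for c in range(w)]
    cells.sort(reverse=True)
    return {(r, c) for _, r, c in cells[:k]}


def local_contrast(features, r, c):
    """Stand-in for an expensive refinement: centre minus 3x3 mean."""
    h, w = len(features), len(features[0])
    neigh = [features[i][j]
             for i in range(max(0, r - 1), min(h, r + 2))
             for j in range(max(0, c - 1), min(w, c + 2))]
    return features[r][c] - sum(neigh) / len(neigh)


def slow_branch(features, sparse_positions, refine):
    """Apply the costly refinement only at the selected positions."""
    h, w = len(features), len(features[0])
    out = [[0.0] * w for _ in range(h)]
    for r, c in sparse_positions:
        out[r][c] = refine(features, r, c)
    return out


# Toy input: a flat 20x20 background with one bright "target" pixel.
H, W = 20, 20
image = [[0.1] * W for _ in range(H)]
image[7][12] = 1.0

positions = fast_branch(image)                      # 2 of 400 cells kept
refined = slow_branch(image, positions, local_contrast)
print(len(positions), "of", H * W, "cells refined")
```

In this toy setting the refinement runs on 2 cells instead of 400, which is the source of the savings the abstract describes: the cost of the slow branch scales with the number of selected positions rather than with the full map resolution.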
