CVDec 23, 2024

Learning Dynamic Local Context Representations for Infrared Small Target Detection

Guoyi Zhang, Guangsheng Xu, Han Wang, Siyang Chen, Yunxiao Shan, Xiaohu Zhang

arXiv:2412.17401v17.6h-index: 5

Originality Incremental advance

AI Analysis

This addresses the problem of detecting small targets in infrared images for applications like surveillance, though it appears incremental as it builds on existing CNN and transformer approaches.

The paper tackles infrared small target detection by proposing LCRNet, a method that learns dynamic local context representations, achieving state-of-the-art performance with only 1.65M parameters.

Infrared small target detection (ISTD) is challenging due to complex backgrounds, low signal-to-clutter ratios, and varying target sizes and shapes. Effective detection relies on capturing local contextual information at the appropriate scale. However, small-kernel CNNs have limited receptive fields, leading to false alarms, while transformer models, with global receptive fields, often treat small targets as noise, resulting in miss-detections. Hybrid models struggle to bridge the semantic gap between CNNs and transformers, causing high complexity.To address these challenges, we propose LCRNet, a novel method that learns dynamic local context representations for ISTD. The model consists of three components: (1) C2FBlock, inspired by PDE solvers, for efficient small target information capture; (2) DLC-Attention, a large-kernel attention mechanism that dynamically builds context and reduces feature redundancy; and (3) HLKConv, a hierarchical convolution operator based on large-kernel decomposition that preserves sparsity and mitigates the drawbacks of dilated convolutions. Despite its simplicity, with only 1.65M parameters, LCRNet achieves state-of-the-art (SOTA) performance.Experiments on multiple datasets, comparing LCRNet with 33 SOTA methods, demonstrate its superior performance and efficiency.

View on arXiv PDF

Similar