Dynamic Background Reconstruction via MAE for Infrared Small Target Detection
This work addresses a domain-specific problem in infrared imaging, with applications such as surveillance. Its contribution is incremental: it extends existing background reconstruction methods with new modules.
The paper tackles infrared small target detection under complex backgrounds by proposing Dynamic Background Reconstruction (DBR), which uses a masking strategy with Vision Transformers to reconstruct backgrounds. It reports F1-scores of 64.10% on the MFIRST dataset and 75.01% on the SIRST dataset.
Infrared small target detection (ISTD) under complex backgrounds is a difficult problem because targets are hard to distinguish from their backgrounds. Background reconstruction is one approach to this problem. This paper proposes an ISTD method based on background reconstruction, called Dynamic Background Reconstruction (DBR). DBR consists of three modules: a dynamic shift window module (DSW), a background reconstruction module (BR), and a detection head (DH). BR exploits the ability of Vision Transformers to reconstruct missing patches and adopts a grid masking strategy with a masking ratio of 50\% to reconstruct clean backgrounds without targets. A target that is split across two neighboring patches can cause reconstruction to fail, so DSW is applied before input embedding: it calculates offsets and dynamically shifts the infrared image according to them. To reduce false positive (FP) cases caused by mistaking reconstruction errors for targets, DH uses a densely connected Transformer structure to further improve detection performance. Experimental results show that DBR achieves the best F1-score on two ISTD datasets, MFIRST (64.10\%) and SIRST (75.01\%).
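
The two ideas in the abstract that are easiest to make concrete are the 50\% grid mask and the reason a shift is needed. The sketch below is illustrative only and is not the authors' implementation. It assumes a checkerboard layout for the grid mask (the abstract gives the ratio but not the layout), a square patch size, and a hypothetical brute-force search, `find_shift`, standing in for however DSW actually computes its offsets. All function names are invented for this example.

```python
def grid_mask(rows, cols):
    """Checkerboard mask over a rows x cols patch grid (True = masked patch).

    Assumption: a checkerboard is one simple layout that gives a 50% ratio.
    """
    return [[(r + c) % 2 == 0 for c in range(cols)] for r in range(rows)]


def masking_ratio(mask):
    """Fraction of patches that are masked."""
    flat = [m for row in mask for m in row]
    return sum(flat) / len(flat)


def patches_covered(x0, y0, w, h, patch):
    """Patch indices (row, col) overlapped by a w x h box whose top-left is (x0, y0)."""
    return {(y // patch, x // patch)
            for y in range(y0, y0 + h)
            for x in range(x0, x0 + w)}


def find_shift(x0, y0, w, h, patch):
    """Hypothetical stand-in for DSW: scan shifts (dx, dy) in [0, patch) and
    return the first one that places the whole box inside a single patch.

    Returns None if the box is larger than one patch.
    """
    for dy in range(patch):
        for dx in range(patch):
            if len(patches_covered(x0 + dx, y0 + dy, w, h, patch)) == 1:
                return dx, dy
    return None
```

With 16-pixel patches, a 4 x 4 target at (14, 5) overlaps two horizontally adjacent patches, and `find_shift(14, 5, 4, 4, 16)` returns `(2, 0)`, the first shift that moves it entirely into one patch. A real detector does not know the target position in advance, which is why the paper computes the offsets from the image itself rather than from a known box.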