Revisiting DETR for Small Object Detection via Noise-Resilient Query Optimization
This work targets small object detection, a known bottleneck for Transformer-based detectors in computer vision, and proposes a new method to address it.
The paper tackles small object detection in Transformer-based detectors by addressing two problems: noise sensitivity in feature pyramid networks and poor query quality under existing label assignment strategies. It proposes a Noise-Resilient Query Optimization (NRQO) paradigm that outperforms state-of-the-art baselines on multiple benchmarks.
Despite advancements in Transformer-based detectors for small object detection (SOD), recent studies show that these detectors still face challenges due to the inherent noise sensitivity of feature pyramid networks (FPN) and the diminished query quality produced by existing label assignment strategies. In this paper, we propose a novel Noise-Resilient Query Optimization (NRQO) paradigm, which combines the Noise-Tolerance Feature Pyramid Network (NT-FPN) and the Pairwise-Similarity Region Proposal Network (PS-RPN). Specifically, NT-FPN mitigates noise during feature fusion in FPN by preserving the integrity of spatial and semantic information. Unlike existing label assignment strategies, PS-RPN generates a sufficient number of high-quality positive queries by enhancing anchor-to-ground-truth matching through position and shape similarities, without the need for additional hyperparameters. Extensive experiments on multiple benchmarks consistently demonstrate the superiority of NRQO over state-of-the-art baselines.
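To make the idea of matching anchors to ground truth by position and shape similarity more concrete, here is a minimal sketch. The abstract does not give the actual similarity formula used by PS-RPN, so everything below is an assumption made for illustration: the function name `pairwise_similarity`, the `(cx, cy, w, h)` box format, the exponential of the normalized center distance as the position term, and the min/max ratio of widths and heights as the shape term. It should not be read as the authors' implementation.

```python
import numpy as np


def pairwise_similarity(anchors: np.ndarray, gts: np.ndarray) -> np.ndarray:
    """Illustrative anchor/ground-truth similarity (hypothetical, not the paper's formula).

    anchors: (N, 4) array of boxes as (cx, cy, w, h)
    gts:     (M, 4) array of boxes as (cx, cy, w, h)
    Returns an (N, M) array of scores in (0, 1].
    """
    a = anchors[:, None, :]  # (N, 1, 4), broadcast against ground truths
    g = gts[None, :, :]      # (1, M, 4)

    # Position term (assumed form): center offset normalized by the
    # ground-truth size, mapped to (0, 1] with exp(-distance).
    dx = (a[..., 0] - g[..., 0]) / g[..., 2]
    dy = (a[..., 1] - g[..., 1]) / g[..., 3]
    position = np.exp(-np.sqrt(dx ** 2 + dy ** 2))

    # Shape term (assumed form): min/max ratio of widths and of heights,
    # equal to 1 only when both boxes have the same size.
    w_ratio = np.minimum(a[..., 2], g[..., 2]) / np.maximum(a[..., 2], g[..., 2])
    h_ratio = np.minimum(a[..., 3], g[..., 3]) / np.maximum(a[..., 3], g[..., 3])
    shape = w_ratio * h_ratio

    return position * shape


# Three candidate anchors scored against one small ground-truth box.
anchors = np.array([
    [10.0, 10.0, 4.0, 4.0],  # identical to the ground truth
    [12.0, 10.0, 4.0, 4.0],  # same shape, shifted center
    [10.0, 10.0, 8.0, 8.0],  # same center, twice the width and height
])
gts = np.array([[10.0, 10.0, 4.0, 4.0]])
sim = pairwise_similarity(anchors, gts)
```

In this sketch the score is a product of two bounded terms and contains no tunable threshold, and it stays above zero even when an anchor and a ground-truth box do not overlap. Whether PS-RPN combines the two similarities this way is not stated in the abstract.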