WiSE-OD: Benchmarking Robustness in Infrared Object Detection
This work addresses robustness in infrared object detection for low-light and nighttime applications. The contribution is incremental: it builds on existing methods, adding new benchmarks and a weight-space ensembling technique.
The paper targets the loss of robustness under distribution shift that results from the modality gap between RGB and IR. It introduces WiSE-OD, a weight-space ensembling method that improves cross-modality and corruption robustness without any additional training or inference cost.
Object detection (OD) in infrared (IR) imagery is critical for low-light and nighttime applications. However, the scarcity of large-scale IR datasets forces models to rely on weights pre-trained on RGB images. While fine-tuning on IR improves accuracy, it often compromises robustness under distribution shifts due to the inherent modality gap between RGB and IR. To address this, we introduce LLVIP-C and FLIR-C, two cross-modality out-of-distribution (OOD) benchmarks built by applying corruptions to standard IR datasets. Additionally, to fully leverage the complementary knowledge of RGB-trained and IR-trained models, we propose WiSE-OD, a weight-space ensembling method with two variants: WiSE-OD$_{ZS}$, which combines RGB zero-shot and IR fine-tuned weights, and WiSE-OD$_{LP}$, which blends zero-shot and linear-probed weights. Evaluated across three RGB-pretrained detectors and two robust baselines, WiSE-OD improves both cross-modality and corruption robustness without any additional training or inference cost.
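To make the idea of weight-space ensembling concrete, here is a minimal sketch. It assumes the ensemble is a plain linear interpolation between two checkpoints of the same architecture, controlled by a mixing coefficient; the abstract does not spell out the exact rule, so the function name interpolate_weights, the coefficient lam, and the toy parameter names are all hypothetical. Because the result is a single set of weights, the merged detector runs one forward pass per image, which is consistent with the claim of no additional inference cost.

```python
from typing import Dict, List

# A checkpoint is modelled as a mapping from parameter name to a flat list
# of floats. Real detectors store tensors; flat lists keep the sketch
# dependency-free.
Weights = Dict[str, List[float]]


def interpolate_weights(theta_a: Weights, theta_b: Weights, lam: float) -> Weights:
    """Return (1 - lam) * theta_a + lam * theta_b, parameter by parameter.

    Assumption: both checkpoints come from the same architecture, so they
    share parameter names and shapes.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    if theta_a.keys() != theta_b.keys():
        raise ValueError("both checkpoints must share the same parameter names")

    merged: Weights = {}
    for name, a in theta_a.items():
        b = theta_b[name]
        if len(a) != len(b):
            raise ValueError(f"shape mismatch for parameter {name!r}")
        merged[name] = [(1.0 - lam) * x + lam * y for x, y in zip(a, b)]
    return merged


# Toy stand-ins for an RGB zero-shot checkpoint and an IR fine-tuned one.
rgb_zero_shot = {"backbone.w": [1.0, 2.0], "head.b": [0.0]}
ir_fine_tuned = {"backbone.w": [3.0, 6.0], "head.b": [1.0]}

merged = interpolate_weights(rgb_zero_shot, ir_fine_tuned, lam=0.5)
print(merged)
```

Under this assumption, lam = 0 recovers the RGB zero-shot weights and lam = 1 recovers the IR fine-tuned weights. A WiSE-OD$_{LP}$-style blend would pass linear-probed weights as the second argument instead of fully fine-tuned ones.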