DG-DETR: Toward Domain Generalized Detection Transformer
This work addresses domain generalization for object detection in computer vision. It is an incremental advance within that specific subfield.
The paper tackles domain generalization for Transformer-based object detectors (DETRs), which has been understudied relative to CNN-based methods. It introduces DG-DETR, a plug-and-play approach that improves out-of-distribution robustness, as validated by experimental results.
End-to-end Transformer-based detectors (DETRs) have demonstrated strong detection performance. However, domain generalization (DG) research has primarily focused on convolutional neural network (CNN)-based detectors, while paying little attention to enhancing the robustness of DETRs. In this letter, we introduce a Domain Generalized DEtection TRansformer (DG-DETR), a simple, effective, and plug-and-play method that improves out-of-distribution (OOD) robustness for DETRs. Specifically, we propose a novel domain-agnostic query selection strategy that removes domain-induced biases from object queries via orthogonal projection onto the instance-specific style space. Additionally, we leverage a wavelet decomposition to disentangle features into domain-invariant and domain-specific components, enabling synthesis of diverse latent styles while preserving the semantic features of objects. Experimental results validate the effectiveness of DG-DETR. Our code is available at https://github.com/sminhwang/DG-DETR.
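The two operations named in the abstract, orthogonal projection and wavelet decomposition, can be illustrated with a minimal sketch. It is not taken from the DG-DETR code base. The function names (`remove_style_component`, `haar_split`, `haar_merge`) are hypothetical, the style space is assumed to be spanned by a single vector, and a one-level 1D Haar transform stands in for whatever wavelet the method actually uses. The abstract does not say which frequency band is treated as domain-invariant and which as domain-specific, so the sketch only shows the split and its exact inverse.

```python
import math


def remove_style_component(query, style):
    """Return `query` with its component along `style` removed.

    Assumption for illustration: the style space is one-dimensional,
    spanned by the vector `style`.
    """
    norm_sq = sum(s * s for s in style)
    if norm_sq == 0.0:
        # A zero style vector spans nothing, so there is nothing to remove.
        return list(query)
    scale = sum(q * s for q, s in zip(query, style)) / norm_sq
    return [q - scale * s for q, s in zip(query, style)]


def haar_split(signal):
    """One-level orthonormal Haar transform of an even-length 1D signal.

    Returns (low, high): pairwise averages and pairwise differences,
    each scaled by 1/sqrt(2).
    """
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    root2 = math.sqrt(2.0)
    pairs = list(zip(signal[0::2], signal[1::2]))
    low = [(a + b) / root2 for a, b in pairs]
    high = [(a - b) / root2 for a, b in pairs]
    return low, high


def haar_merge(low, high):
    """Inverse of `haar_split`: rebuild the signal from its two bands."""
    root2 = math.sqrt(2.0)
    signal = []
    for l, h in zip(low, high):
        signal.extend([(l + h) / root2, (l - h) / root2])
    return signal


if __name__ == "__main__":
    query = [3.0, 1.0, 2.0]
    style = [1.0, 0.0, 0.0]
    cleaned = remove_style_component(query, style)
    # The cleaned query has no remaining component along the style vector.
    print(sum(c * s for c, s in zip(cleaned, style)))

    low, high = haar_split([4.0, 2.0, 5.0, 5.0])
    # Editing one band and merging changes the signal while leaving the
    # other band untouched; merging unmodified bands recovers the input.
    print(haar_merge(low, high))
```

Because the Haar split is invertible, one band can be perturbed and merged back without altering the other, which is the property that makes style synthesis with preserved semantics possible in principle.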