CV · Feb 7, 2024

G-NAS: Generalizable Neural Architecture Search for Single Domain Generalization Object Detection

arXiv:2402.04672v1 · 33 citations · h-index: 17 · Has Code · AAAI
Originality: Incremental advance
AI Analysis

The paper addresses a realistic challenge in object detection for applications such as autonomous driving, where training data comes from a single domain but the detector must work across diverse environments. The contribution is incremental, since it builds on existing NAS methods.

The paper tackles the problem of Single Domain Generalization Object Detection (S-DGOD), where object detectors trained on only one source domain must generalize to multiple unseen target domains, and proposes G-NAS with a Generalizable loss to prevent over-fitting, achieving state-of-the-art performance on urban-scene datasets.

In this paper, we focus on a realistic yet challenging task, Single Domain Generalization Object Detection (S-DGOD), where only one source domain's data can be used for training object detectors, but the detectors must generalize to multiple distinct target domains. In S-DGOD, both high-capacity fitting and generalization abilities are needed due to the task's complexity. Differentiable Neural Architecture Search (NAS) is known for its high capacity for complex data fitting, and we propose to leverage Differentiable NAS to solve S-DGOD. However, it may confront severe over-fitting issues due to the feature imbalance phenomenon, where parameters optimized by gradient descent are biased to learn from the easy-to-learn features, which are usually non-causal and spuriously correlated to ground truth labels, such as the features of background in object detection data. Consequently, this leads to serious performance degradation, especially when generalizing to unseen target domains that have huge domain gaps from the source domain. To address this issue, we propose the Generalizable loss (G-loss), an OoD-aware objective that prevents NAS from over-fitting by using gradient descent to optimize parameters not only on a subset of easy-to-learn features but also on the remaining predictive features for generalization. The overall framework is named G-NAS. Experimental results on the S-DGOD urban-scene datasets demonstrate that the proposed G-NAS achieves SOTA performance compared to baseline methods. Code is available at https://github.com/wufan-cse/G-NAS.
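
For readers unfamiliar with Differentiable NAS, the sketch below illustrates the general continuous-relaxation idea (in the style of DARTS): each candidate operation gets a learnable architecture weight, the layer output is the softmax-weighted sum of all candidates, and the final architecture keeps the highest-weighted operation. This is not the G-NAS implementation and does not model the G-loss, whose definition is not given in the abstract. The candidate operations, function names, and scalar "features" are hypothetical toy stand-ins chosen only for illustration.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical candidate operations acting on a scalar feature.
# A real search space would use convolutions, pooling, skip connections, etc.
CANDIDATE_OPS = {
    "identity": lambda x: x,
    "double": lambda x: 2.0 * x,
    "zero": lambda x: 0.0,
}

def mixed_op(x, alphas):
    """Continuous relaxation: softmax-weighted sum of all candidate outputs.

    `alphas` are the architecture parameters (one per candidate op), which a
    differentiable NAS method would optimize by gradient descent.
    """
    weights = softmax(alphas)
    return sum(w * op(x) for w, op in zip(weights, CANDIDATE_OPS.values()))

def discretize(alphas):
    """After search, keep only the operation with the largest weight."""
    names = list(CANDIDATE_OPS)
    best = max(range(len(alphas)), key=lambda i: alphas[i])
    return names[best]

if __name__ == "__main__":
    alphas = [0.1, 2.0, -1.0]  # assumed values, as if produced by a search
    print(mixed_op(1.0, alphas))
    print(discretize(alphas))
```

Because the mixed output is differentiable in `alphas`, the architecture weights can be trained with the same gradient-based optimizer as the network weights. According to the abstract, that same property is what exposes the search to the bias toward easy-to-learn features.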

Code Implementations: 1 repo
