LGCVApr 4, 2023

ERM++: An Improved Baseline for Domain Generalization

arXiv:2304.01973v425 citationsh-index: 83Has Code
Originality Incremental advance
AI Analysis

This provides a stronger, simpler baseline for researchers and practitioners in domain generalization, challenging the need for complex methods.

The paper tackles the problem of domain generalization by improving the empirical risk minimization baseline, showing that tuning overlooked factors like training data utilization and initialization boosts performance by over 5% with ResNet-50 and over 15% with ViT-B/16 on standard benchmarks, outperforming state-of-the-art methods.

Domain Generalization (DG) aims to develop classifiers that can generalize to new, unseen data distributions, a critical capability when collecting new domain-specific data is impractical. A common DG baseline minimizes the empirical risk on the source domains. Recent studies have shown that this approach, known as Empirical Risk Minimization (ERM), can outperform most more complex DG methods when properly tuned. However, these studies have primarily focused on a narrow set of hyperparameters, neglecting other factors that can enhance robustness and prevent overfitting and catastrophic forgetting, properties which are critical for strong DG performance. In our investigation of training data utilization (i.e., duration and setting validation splits), initialization, and additional regularizers, we find that tuning these previously overlooked factors significantly improves model generalization across diverse datasets without adding much complexity. We call this improved, yet simple baseline ERM++. Despite its ease of implementation, ERM++ improves DG performance by over 5\% compared to prior ERM baselines on a standard benchmark of 5 datasets with a ResNet-50 and over 15\% with a ViT-B/16. It also outperforms all state-of-the-art methods on DomainBed datasets with both architectures. Importantly, ERM++ is easy to integrate into existing frameworks like DomainBed, making it a practical and powerful tool for researchers and practitioners. Overall, ERM++ challenges the need for more complex DG methods by providing a stronger, more reliable baseline that maintains simplicity and ease of use. Code is available at \url{https://github.com/piotr-teterwak/erm_plusplus}

Code Implementations2 repos
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes