LG
Nov 26, 2021

Failure Modes of Domain Generalization Algorithms

Tigran Galstyan, Hrayr Harutyunyan, Hrant Khachatrian, Greg Ver Steeg, Aram Galstyan

arXiv:2111.13733v19.913 citations
Has Code

Originality Incremental advance

AI Analysis

This work examines why domain generalization algorithms often fail to outperform simple baselines, a question that matters to machine learning researchers and practitioners who need models that stay robust on unseen domains.

The paper proposes an evaluation framework that decomposes the generalization error of domain generalization algorithms into distinct components. It finds that the largest contributor to the error varies across methods, datasets, regularization strengths, and training lengths, and it identifies two failure modes of domain-invariant representation learning.

Domain generalization algorithms use training data from multiple domains to learn models that generalize well to unseen domains. While recently proposed benchmarks demonstrate that most of the existing algorithms do not outperform simple baselines, the established evaluation methods fail to expose the impact of various factors that contribute to the poor performance. In this paper we propose an evaluation framework for domain generalization algorithms that allows decomposition of the error into components capturing distinct aspects of generalization. Inspired by the prevalence of algorithms based on the idea of domain-invariant representation learning, we extend the evaluation framework to capture various types of failures in achieving invariance. We show that the largest contributor to the generalization error varies across methods, datasets, regularization strengths and even training lengths. We observe two problems associated with the strategy of learning domain-invariant representations. On Colored MNIST, most domain generalization algorithms fail because they reach domain-invariance only on the training domains. On Camelyon-17, domain-invariance degrades the quality of representations on unseen domains. We hypothesize that focusing instead on tuning the classifier on top of a rich representation can be a promising direction.
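
To make the idea of an error decomposition concrete, here is a minimal sketch in Python. It is not the paper's framework: the three components, the names `ErrorDecomposition` and `decompose`, and the numbers in the usage lines are all hypothetical, chosen only to show how a test-domain error can be split into additive terms and how the largest term can be identified.

```python
from dataclasses import dataclass


@dataclass
class ErrorDecomposition:
    """Hypothetical three-way split of the error on an unseen domain."""

    train_error: float        # error on training samples from the training domains
    in_domain_gap: float      # held-out error on training domains minus train_error
    out_of_domain_gap: float  # unseen-domain error minus held-out training-domain error

    @property
    def total(self) -> float:
        # The terms telescope, so their sum equals the unseen-domain error.
        return self.train_error + self.in_domain_gap + self.out_of_domain_gap

    def largest_contributor(self) -> str:
        parts = {
            "train_error": self.train_error,
            "in_domain_gap": self.in_domain_gap,
            "out_of_domain_gap": self.out_of_domain_gap,
        }
        return max(parts, key=parts.get)


def decompose(train_error: float, heldout_error: float, unseen_error: float) -> ErrorDecomposition:
    """Build the split from three measured error rates (illustrative assumption)."""
    return ErrorDecomposition(
        train_error=train_error,
        in_domain_gap=heldout_error - train_error,
        out_of_domain_gap=unseen_error - heldout_error,
    )


# Made-up error rates, for illustration only.
d = decompose(train_error=0.02, heldout_error=0.10, unseen_error=0.35)
print(d.largest_contributor())
```

Comparing which term dominates across runs is one simple way to see how the main source of error could shift with the method, dataset, or training length, which is the kind of question the abstract raises.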

