LG, CV, ML | Oct 29, 2020

Understanding the Failure Modes of Out-of-Distribution Generalization

arXiv:2010.15775v3 | 208 citations
Originality: Incremental advance
AI Analysis

The paper addresses poor out-of-distribution generalization in machine learning models, offering researchers incremental insight into the underlying failure mechanisms.

The paper investigates why machine learning models fail to generalize out-of-distribution due to spurious correlations, identifying two failure modes (geometric and statistical) through theoretical analysis of linear classifiers and validating them with modified image datasets and neural network experiments.

Empirical studies suggest that machine learning models often rely on features, such as the background, that may be spuriously correlated with the label only during training time, resulting in poor accuracy during test-time. In this work, we identify the fundamental factors that give rise to this behavior, by explaining why models fail this way even in easy-to-learn tasks where one would expect these models to succeed. In particular, through a theoretical study of gradient-descent-trained linear classifiers on some easy-to-learn tasks, we uncover two complementary failure modes. These modes arise from how spurious correlations induce two kinds of skews in the data: one geometric in nature, and another, statistical in nature. Finally, we construct natural modifications of image classification datasets to understand when these failure modes can arise in practice. We also design experiments to isolate the two failure modes when training modern neural networks on these datasets.
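The abstract's setting (a linear classifier trained by gradient descent on an easy task with a spuriously correlated feature) can be made concrete with a small toy experiment. The sketch below is an illustration only and is not the paper's construction: the two-feature dataset, the 90% alignment rate, the feature scale of 2, the learning rate, and the helper names `make_data`, `train_logistic_gd`, and `accuracy` are all assumptions chosen for this example. It is meant to show the statistical flavor of the problem: a feature that agrees with the label on most training points picks up positive weight under gradient descent, even though the invariant feature alone separates the data.

```python
import numpy as np


def make_data(n, p_aligned, scale):
    """Build a deterministic two-feature dataset (hypothetical toy setup).

    Column 0 is an invariant feature equal to the label y in {-1, +1}.
    Column 1 is a spurious feature equal to +scale*y on a fraction
    p_aligned of the points and -scale*y on the remaining points.
    """
    idx = np.arange(n)
    y = np.where(idx % 2 == 0, 1.0, -1.0)
    n_aligned = int(round(p_aligned * n))
    sign = np.where(idx < n_aligned, 1.0, -1.0)
    X = np.stack([y, scale * sign * y], axis=1)
    return X, y


def train_logistic_gd(X, y, lr, steps):
    """Full-batch gradient descent on the mean logistic loss, from w = 0."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        margins = y * (X @ w)
        # Gradient of mean(log(1 + exp(-margin))) with respect to w is
        # -mean(sigmoid(-margin) * y * x), and sigmoid(-m) = 1 / (1 + exp(m)).
        coeff = 1.0 / (1.0 + np.exp(margins))
        grad = -((coeff * y) @ X) / len(y)
        w -= lr * grad
    return w


def accuracy(w, X, y):
    """Fraction of points whose predicted sign matches the label."""
    return float(np.mean(np.sign(X @ w) == y))


# Training data: the spurious feature agrees with the label on 90% of points.
X_train, y_train = make_data(n=1000, p_aligned=0.9, scale=2.0)
# Shifted test data: the spurious feature always disagrees with the label.
X_test, y_test = make_data(n=1000, p_aligned=0.0, scale=2.0)

w_early = train_logistic_gd(X_train, y_train, lr=0.1, steps=1)
w_late = train_logistic_gd(X_train, y_train, lr=0.1, steps=5000)

for name, w in [("early (1 step)", w_early), ("late (5000 steps)", w_late)]:
    print(name, "weights [invariant, spurious]:", w)
    print("  train accuracy:", accuracy(w, X_train, y_train))
    print("  shifted-test accuracy:", accuracy(w, X_test, y_test))
```

In this toy setup the max-margin separator is (1, 0), which ignores the spurious feature entirely. Gradient descent run for a finite number of steps still keeps a positive weight on that feature, and after a single step the spurious weight is large enough that every point in the shifted test set is misclassified. With longer training the invariant weight grows and the shifted points are eventually classified correctly, but the spurious weight does not return to zero.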

Code Implementations: 1 repo
