LG · CV · ML · May 9, 2020

An Investigation of Why Overparameterization Exacerbates Spurious Correlations

arXiv:2005.04345v3 · 444 citations
AI Analysis

The paper addresses fairness and robustness concerns for practitioners who train overparameterized models. Its contribution is incremental: it builds on existing research on spurious correlations and overparameterization.

The paper investigates why overparameterization worsens test error on minority groups when the training data contain spurious correlations. It identifies two properties of the data as key drivers: the proportions of majority versus minority groups, and the signal-to-noise ratio of the spurious correlations. It then proposes a counterintuitive remedy, subsampling the majority group, which reduces minority error.

We study why overparameterization -- increasing model size well beyond the point of zero training error -- can hurt test error on minority groups despite improving average test error when there are spurious correlations in the data. Through simulations and experiments on two image datasets, we identify two key properties of the training data that drive this behavior: the proportions of majority versus minority groups, and the signal-to-noise ratio of the spurious correlations. We then analyze a linear setting and theoretically show how the inductive bias of models towards "memorizing" fewer examples can cause overparameterization to hurt. Our analysis leads to a counterintuitive approach of subsampling the majority group, which empirically achieves low minority error in the overparameterized regime, even though the standard approach of upweighting the minority fails. Overall, our results suggest a tension between using overparameterized models versus using all the training data for achieving low worst-group error.
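The two reweighting strategies contrasted in the abstract, and the worst-group error they are judged by, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, each example is assumed to carry an integer group id, and "subsampling the majority" is interpreted here as shrinking every group to the size of the smallest one.

```python
import numpy as np


def subsample_to_smallest_group(groups, rng):
    """Return sorted indices that keep an equal number of examples per group.

    Assumption: every group is reduced to the size of the smallest group,
    which discards majority-group examples rather than reweighting them.
    """
    groups = np.asarray(groups)
    ids, counts = np.unique(groups, return_counts=True)
    n_keep = counts.min()
    kept = [
        rng.choice(np.flatnonzero(groups == g), size=n_keep, replace=False)
        for g in ids
    ]
    return np.sort(np.concatenate(kept))


def inverse_frequency_weights(groups):
    """Per-example weights for the upweighting baseline.

    Each example gets weight 1 / (size of its group), so every group
    contributes the same total weight to the training loss.
    """
    groups = np.asarray(groups)
    _, inverse, counts = np.unique(
        groups, return_inverse=True, return_counts=True
    )
    return 1.0 / counts[inverse]


def worst_group_error(y_true, y_pred, groups):
    """Largest per-group misclassification rate."""
    y_true, y_pred, groups = map(np.asarray, (y_true, y_pred, groups))
    errors = y_true != y_pred
    return max(errors[groups == g].mean() for g in np.unique(groups))
```

With 90 majority and 10 minority examples, `subsample_to_smallest_group` keeps 20 indices (10 per group), whereas `inverse_frequency_weights` keeps all 100 examples and only rescales their contribution to the loss. That difference mirrors the tension the abstract describes between using an overparameterized model and using all of the training data.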

Code Implementations: 3 repos
