Risk Phase Transitions in Spiked Regression: Alignment Driven Benign and Catastrophic Overfitting
This work studies overfitting transitions in overparameterized regression. It is aimed at machine learning theorists, and its theoretical framework is incremental but clarifies the specific conditions under which each overfitting regime arises.
The paper analyzes generalization error in linear regression under spiked covariance models. It derives an exact expression that classifies regimes into benign, tempered, and catastrophic overfitting according to spike strength, aspect ratio, and target alignment, and shows that increasing spike strength can induce catastrophic overfitting in aligned problems.
This paper analyzes the generalization error of minimum-norm interpolating solutions in linear regression under spiked covariance data models. It characterizes how spike strength and target-spike alignment affect risk, especially in overparameterized settings. An exact expression for the generalization error yields a classification of benign, tempered, and catastrophic overfitting regimes in terms of spike strength, the aspect ratio $c=d/n$ (particularly as $c \to \infty$), and target alignment. Notably, in well-specified aligned problems, increasing spike strength can induce catastrophic overfitting before benign overfitting is achieved. The paper also shows that target-spike alignment is not always advantageous, and identifies specific, sometimes counterintuitive, conditions under which it helps or hurts. The detrimental effect of alignment with the spike is shown empirically to persist in nonlinear models.
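As a rough illustration of the setting summarized above, the following sketch estimates the excess risk of the minimum-norm interpolator by simulation. It is not the paper's exact expression or necessarily its data model: the covariance $I + \theta\, u u^\top$, the unit-norm target built from an alignment cosine, the noise level, and the function name `spiked_min_norm_risk` are all assumptions chosen for illustration.

```python
import numpy as np

def spiked_min_norm_risk(n, d, spike, alignment, noise_std=0.5, seed=0):
    """Monte Carlo sketch (hypothetical helper, not from the paper).

    Assumed model: rows x_i ~ N(0, I + spike * u u^T), y = x^T beta_star + noise.
    Returns (excess risk of the min-norm solution, max training residual).
    """
    rng = np.random.default_rng(seed)

    # Assumption: spike direction u = e_1; v = e_2 is an orthogonal unit direction.
    u = np.zeros(d)
    u[0] = 1.0
    v = np.zeros(d)
    v[1] = 1.0

    # Unit-norm target; `alignment` is its cosine with the spike direction.
    beta_star = alignment * u + np.sqrt(1.0 - alignment**2) * v

    # Isotropic Gaussian features, with variance inflated along u.
    X = rng.standard_normal((n, d))
    X[:, 0] *= np.sqrt(1.0 + spike)
    y = X @ beta_star + noise_std * rng.standard_normal(n)

    # Minimum-norm least-squares solution; it interpolates when d > n.
    beta_hat = np.linalg.pinv(X) @ y

    # Excess risk (beta_hat - beta_star)^T Sigma (beta_hat - beta_star),
    # using the diagonal of the assumed population covariance.
    err = beta_hat - beta_star
    sigma_diag = np.ones(d)
    sigma_diag[0] = 1.0 + spike
    risk = float(err @ (sigma_diag * err))

    train_residual = float(np.max(np.abs(X @ beta_hat - y)))
    return risk, train_residual

# Sweep spike strength for an aligned and an orthogonal target at c = d/n = 4.
for alignment in (1.0, 0.0):
    for spike in (0.0, 10.0, 100.0):
        risk, _ = spiked_min_norm_risk(n=50, d=200, spike=spike, alignment=alignment)
        print(f"alignment={alignment:.1f} spike={spike:6.1f} excess_risk={risk:.3f}")
```

A single seed at these sizes gives only a noisy estimate. Averaging over seeds and increasing `d` at fixed `n` would be needed before comparing the output against any of the regimes described in the summary.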