Double Descent: Understanding Linear Model Estimation of Nonidentifiable Parameters and a Model for Overfitting
This paper addresses the fundamental challenge of estimating overparameterized models, a setting central to modern machine learning applications.
It examines ordinary least squares and related estimators for high-dimensional problems in which the number of parameters exceeds the number of observations, analyzes their behavior in prediction, and illustrates the double descent phenomenon as a model for overfitting.
We consider ordinary least squares estimation and variations on least squares estimation, such as penalized (regularized) least squares and spectral shrinkage estimates, for problems with p > n (more parameters than observations) and the associated problems of predicting new observations. After the introduction in Section 1, Section 2 examines a number of commonly used estimators for p > n. Section 3 introduces prediction with p > n. Section 4 introduces notational changes to facilitate discussion of overfitting, and Section 5 illustrates the phenomenon of double descent. We conclude with some final comments.
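To make the p > n setting concrete, the following is a minimal sketch, not the paper's own code, of why the parameters are nonidentifiable. It uses simulated Gaussian data and assumes NumPy; the variable names and the penalty value `lam` are hypothetical choices made for illustration. It computes the minimum-norm least squares solution with the Moore-Penrose pseudoinverse, builds a second coefficient vector that fits the training data equally well, and compares both with a ridge (penalized least squares) estimate.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 20, 50                          # more parameters than observations
X = rng.standard_normal((n, p))        # simulated design matrix (assumption)
beta_true = rng.standard_normal(p)
y = X @ beta_true + 0.1 * rng.standard_normal(n)

# Minimum-norm least squares solution via the Moore-Penrose pseudoinverse.
X_pinv = np.linalg.pinv(X)
beta_mn = X_pinv @ y

# A second exact solution: add a component from the null space of X.
z = rng.standard_normal(p)
null_part = z - X_pinv @ (X @ z)       # projection of z onto the null space of X
beta_alt = beta_mn + null_part

# Ridge (penalized least squares) estimate; lam is a hypothetical penalty value.
lam = 1.0
beta_ridge = np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

# Training residual norms for the three coefficient vectors.
train_resid_mn = np.linalg.norm(y - X @ beta_mn)
train_resid_alt = np.linalg.norm(y - X @ beta_alt)
train_resid_ridge = np.linalg.norm(y - X @ beta_ridge)

print("residual, minimum-norm:", train_resid_mn)
print("residual, alternative: ", train_resid_alt)
print("residual, ridge:       ", train_resid_ridge)
print("coefficient norms:", np.linalg.norm(beta_mn),
      np.linalg.norm(beta_alt), np.linalg.norm(beta_ridge))
```

Both `beta_mn` and `beta_alt` reproduce the training responses up to rounding error even though they are different vectors, so the data alone cannot distinguish between them. The ridge estimate gives up exact interpolation in exchange for a smaller coefficient norm.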