On Uniform Convergence and Low-Norm Interpolation Learning
This addresses a theoretical gap in understanding why interpolating predictors work in overparameterized models, though it is incremental as it builds on existing frameworks like Nagarajan and Kolter's.
The paper tackles the problem of explaining the success of minimum-norm interpolating predictors in underdetermined noisy linear regression, showing that uniform convergence in norm balls fails to demonstrate learning, but a weaker notion of uniform convergence for zero-error predictors can bound generalization error for low-norm interpolators.
We consider an underdetermined noisy linear regression model where the minimum-norm interpolating predictor is known to be consistent, and ask: can uniform convergence in a norm ball, or at least (following Nagarajan and Kolter) the subset of a norm ball that the algorithm selects on a typical input set, explain this success? We show that uniformly bounding the difference between empirical and population errors cannot show any learning in the norm ball, and cannot show consistency for any set, even one depending on the exact algorithm and distribution. But we argue we can explain the consistency of the minimal-norm interpolator with a slightly weaker, yet standard, notion: uniform convergence of zero-error predictors in a norm ball. We use this to bound the generalization error of low- (but not minimal-) norm interpolating predictors.