Generalisation and benign over-fitting for linear regression onto random functional covariates
This work addresses theoretical questions of generalisation in machine learning, focusing on non-i.i.d. data settings; it is incremental in that it builds on existing results and relies on specific assumptions.
The paper tackles the problem of understanding predictive performance in linear regression with random functional covariates, deriving probabilistic bounds on excess risk and showing how covariate noise influences benign overfitting in regimes where the number of features grows fast relative to sample size.
We study the theoretical predictive performance of ridge and ridgeless least-squares regression when the covariate vectors arise from evaluating $p$ random, mean-square continuous functions over a latent metric space at $n$ random and unobserved locations, subject to additive noise. This leads us away from the standard assumption of i.i.d. data to a setting in which the $n$ covariate vectors are exchangeable but, in general, not independent. Under an assumption of independence across dimensions, together with fourth-order moment and other regularity conditions, we obtain probabilistic bounds on a notion of predictive excess risk adapted to our random functional covariate setting, making use of recent results of Barzilai and Shamir. We derive convergence rates in regimes where $p$ grows suitably fast relative to $n$, illustrating the interplay between the ingredients of the model in determining convergence behaviour and the role of additive covariate noise in benign overfitting.
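To make the data-generating setting above concrete, the following is a minimal simulation sketch, not the paper's construction. It assumes a latent space $[0,1]$ with uniformly drawn locations, random functions built from a cosine basis with decaying Gaussian coefficients, Gaussian covariate noise, and a linear response model; these choices and all names (`simulate_covariates`, `ridge_fit`, `beta_true`) are hypothetical and chosen only for illustration. The rows of the covariate matrix share the same random functions, so they are exchangeable but not independent.

```python
import numpy as np


def simulate_covariates(n, p, noise_sd, rng, n_basis=20):
    """Evaluate p random functions at n latent locations and add noise.

    Assumed for illustration: latent space [0, 1], cosine basis,
    coefficients decaying like 1/k, Gaussian additive noise.
    """
    z = rng.uniform(0.0, 1.0, size=n)                # unobserved latent locations
    k = np.arange(1, n_basis + 1)
    basis = np.cos(np.pi * np.outer(z, k))           # shape (n, n_basis)
    # One independent coefficient vector per function (per covariate dimension).
    coef = rng.standard_normal((n_basis, p)) / k[:, None]
    signal = basis @ coef                            # f_j(z_i), shape (n, p)
    return signal + noise_sd * rng.standard_normal((n, p))


def ridge_fit(X, y, lam):
    """Ridge estimator in dual form; lam == 0 gives the minimum-norm
    (ridgeless) least-squares solution via the pseudoinverse."""
    n = X.shape[0]
    if lam == 0.0:
        return np.linalg.pinv(X) @ y
    return X.T @ np.linalg.solve(X @ X.T + lam * np.eye(n), y)


rng = np.random.default_rng(0)
n, p = 30, 200                                       # p much larger than n
X = simulate_covariates(n, p, noise_sd=0.5, rng=rng)

# Hypothetical linear response model, assumed only for this sketch.
beta_true = rng.standard_normal(p) / np.sqrt(p)
y = X @ beta_true + 0.1 * rng.standard_normal(n)

beta_ridgeless = ridge_fit(X, y, lam=0.0)
beta_ridge = ridge_fit(X, y, lam=1.0)

print("ridgeless training residual:", np.linalg.norm(X @ beta_ridgeless - y))
print("ridge training residual:    ", np.linalg.norm(X @ beta_ridge - y))
```

In this sketch the additive covariate noise makes the $n \times p$ matrix full row rank when $p > n$, so the ridgeless solution interpolates the training responses, while the ridge solution trades training fit for a smaller coefficient norm.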