Overfitting Behaviour of Gaussian Kernel Ridgeless Regression: Varying Bandwidth or Dimensionality
This work addresses the theoretical understanding of overfitting in kernel methods, which is relevant to machine learning practitioners, though it is incremental in that it builds on existing risk predictions and assumptions.
The paper investigates the overfitting behavior of Gaussian kernel ridgeless regression when the bandwidth or the input dimension varies with the sample size. For fixed dimension, the ridgeless solution is never consistent and, at least with large enough noise, is worse than the null predictor; for increasing dimension, the paper provides the first example of benign overfitting using the Gaussian kernel with sub-polynomial scaling of the dimension.
We consider the overfitting behavior of minimum norm interpolating solutions of Gaussian kernel ridge regression (i.e. kernel ridgeless regression), when the bandwidth or input dimension varies with the sample size. For fixed dimensions, we show that even with varying or tuned bandwidth, the ridgeless solution is never consistent and, at least with large enough noise, always worse than the null predictor. For increasing dimension, we give a generic characterization of the overfitting behavior for any scaling of the dimension with sample size. We use this to provide the first example of benign overfitting using the Gaussian kernel with sub-polynomial scaling dimension. All our results are under the Gaussian universality ansatz and the (non-rigorous) risk predictions in terms of the kernel eigenstructure.
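To make the object of study concrete, the following is a minimal sketch of a minimum norm interpolating solution with a Gaussian kernel. It is an illustration under stated assumptions, not the paper's experimental code: the kernel parametrization exp(-||x - z||^2 / (2 * bandwidth^2)), the function names, and the use of a least-squares solve are all choices made here for illustration.

```python
import numpy as np


def gaussian_kernel(X, Z, bandwidth):
    """Gaussian kernel matrix with entries exp(-||x - z||^2 / (2 * bandwidth^2)).

    The exact bandwidth convention is an assumption of this sketch.
    """
    sq_dists = (
        np.sum(X**2, axis=1)[:, None]
        + np.sum(Z**2, axis=1)[None, :]
        - 2.0 * X @ Z.T
    )
    # Clip tiny negative values caused by floating-point cancellation.
    sq_dists = np.maximum(sq_dists, 0.0)
    return np.exp(-sq_dists / (2.0 * bandwidth**2))


def ridgeless_fit(X_train, y_train, bandwidth):
    """Coefficients of the minimum norm interpolant (ridge parameter -> 0).

    lstsq returns the minimum-norm least-squares solution of K @ alpha = y,
    which interpolates the training labels when K is invertible.
    """
    K = gaussian_kernel(X_train, X_train, bandwidth)
    alpha, *_ = np.linalg.lstsq(K, y_train, rcond=None)
    return alpha


def ridgeless_predict(X_train, alpha, X_test, bandwidth):
    """Evaluate the interpolant at new inputs."""
    return gaussian_kernel(X_test, X_train, bandwidth) @ alpha


def null_risk(y_test):
    """Squared-error risk of the null predictor, which always outputs 0."""
    return float(np.mean(y_test**2))
```

Comparing the test squared error of `ridgeless_predict` with `null_risk` on noisy labels, while varying `bandwidth` or the input dimension with the sample size, gives an empirical view of the quantities the abstract refers to. Such a simulation only illustrates the setting and does not substitute for the risk predictions used in the paper.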