Kernel interpolation generalizes poorly
The paper addresses a foundational question in understanding benign overfitting in deep learning, but the contribution is incremental: it provides a negative result (a lower bound) rather than a new method.
The paper asks whether kernel interpolation can generalize well. It shows that, under mild conditions, the generalization error is lower bounded by Ω(n^{-ε}) for any ε>0, so kernel interpolation generalizes poorly for a large class of kernels.
One of the most interesting problems in the recent renaissance of kernel regression studies may be whether kernel interpolation can generalize well, since an answer may help us understand the `benign overfitting phenomenon' reported in the literature on deep networks. In this paper, under mild conditions, we show that for any $\varepsilon>0$, the generalization error of kernel interpolation is lower bounded by $\Omega(n^{-\varepsilon})$. In other words, kernel interpolation generalizes poorly for a large class of kernels. As a direct corollary, we show that overfitted wide neural networks defined on the sphere generalize poorly.
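To make the object of study concrete, the following is a minimal sketch of kernel interpolation, assuming the standard minimum-norm form $\hat f(x) = K(x, X)\,K(X, X)^{-1} y$. The Laplace kernel, the bandwidth, the target function, the noise level, and all function names are illustrative assumptions and are not taken from the paper. The sketch only shows that the fitted function reproduces noisy labels exactly on the training set and how a test error against the noiseless target could be estimated; it does not reproduce the paper's bound.

```python
import numpy as np


def laplace_kernel(a, b, bandwidth=1.0):
    """Laplace kernel exp(-||a_i - b_j|| / bandwidth) for all row pairs (illustrative choice)."""
    dists = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return np.exp(-dists / bandwidth)


def fit_kernel_interpolant(x_train, y_train, kernel=laplace_kernel):
    """Return the minimum-norm interpolant x -> K(x, X) K(X, X)^{-1} y."""
    gram = kernel(x_train, x_train)
    # No ridge term: the fit is forced through every training label.
    alpha = np.linalg.solve(gram, y_train)

    def predict(x_new):
        return kernel(x_new, x_train) @ alpha

    return predict


def f_star(x):
    """Hypothetical noiseless regression function used only for this demo."""
    return np.sin(np.pi * x[:, 0])


rng = np.random.default_rng(0)
n = 50
x_train = rng.uniform(-1.0, 1.0, size=(n, 1))
y_train = f_star(x_train) + 0.5 * rng.standard_normal(n)  # noisy labels

predict = fit_kernel_interpolant(x_train, y_train)

# Largest deviation from the noisy labels on the training set (should be ~0).
train_residual = np.max(np.abs(predict(x_train) - y_train))

# Monte Carlo estimate of the excess risk against the noiseless target.
x_test = rng.uniform(-1.0, 1.0, size=(2000, 1))
test_mse = np.mean((predict(x_test) - f_star(x_test)) ** 2)

print(f"max training residual: {train_residual:.2e}")
print(f"estimated test MSE:    {test_mse:.4f}")
```

Repeating the experiment for increasing `n` and plotting `test_mse` on a log-log scale is one way to examine empirically how slowly the error decays, subject to the assumptions above.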