STLGMLApr 9, 2021

How rotational invariance of common kernels prevents generalization in high dimensions

arXiv:2104.04244v133 citations
Originality Incremental advance
AI Analysis

This reveals a fundamental limitation for kernel methods in high-dimensional machine learning, indicating that consistency requires analysis beyond eigenvalue decay.

The paper shows that the rotational invariance of common kernels like RBF and NTK causes a bias toward low-degree polynomials in high dimensions, leading to a lower bound on generalization error for kernel ridge regression across various distributions and kernel scalings.

Kernel ridge regression is well-known to achieve minimax optimal rates in low-dimensional settings. However, its behavior in high dimensions is much less understood. Recent work establishes consistency for kernel regression under certain assumptions on the ground truth function and the distribution of the input data. In this paper, we show that the rotational invariance property of commonly studied kernels (such as RBF, inner product kernels and fully-connected NTK of any depth) induces a bias towards low-degree polynomials in high dimensions. Our result implies a lower bound on the generalization error for a wide range of distributions and various choices of the scaling for kernels with different eigenvalue decays. This lower bound suggests that general consistency results for kernel ridge regression in high dimensions require a more refined analysis that depends on the structure of the kernel beyond its eigenvalue decay.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes