ML · DIS-NN · LG · Jun 4, 2024

Demystifying Spectral Bias on Real-World Data

arXiv:2406.02663v2 · 2 citations
Originality: Synthesis-oriented
AI Analysis

This work addresses a fundamental problem for researchers and practitioners who use kernel methods. It appears incremental, however, because it builds on existing spectral theory rather than introducing a new paradigm.

The paper tackles the challenge of analyzing spectral bias in kernel methods on real-world data. It proposes using eigenvalues and eigenfunctions computed for idealized data measures to bound learnability, which gives insight into how the symmetries of realistic kernels affect learning.

Kernel ridge regression (KRR) and Gaussian processes (GPs) are fundamental tools in statistics and machine learning, with recent applications to highly over-parameterized deep neural networks. The ability of these tools to learn a target function is directly related to the eigenvalues of their kernel sampled on the input data distribution. Targets that have support on higher eigenvalues are more learnable. However, solving such eigenvalue problems on real-world data remains a challenge. Here, we consider cross-dataset learnability and show that one may use eigenvalues and eigenfunctions associated with highly idealized data measures to reveal spectral bias on complex datasets and bound learnability on real-world data. This allows us to leverage various symmetries that realistic kernels manifest to unravel their spectral bias.
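
As a rough illustration of the link between kernel eigenvalues and learnability described in the abstract, the Python sketch below works through a small KRR example. It is not the paper's method. The RBF kernel, the Gaussian inputs, the ridge value, and the helper names `rbf_kernel` and `mode_shrinkage` are all assumptions chosen for illustration. The sketch relies on a standard identity: the in-sample KRR predictor K (K + ridge * I)^{-1} y scales the component of y along each eigenvector of the Gram matrix K by lambda_i / (lambda_i + ridge), so components on large eigenvalues are fit almost fully while components on small eigenvalues are suppressed.

```python
import numpy as np


def rbf_kernel(X, Y, length_scale=1.0):
    """Gaussian (RBF) kernel matrix between the rows of X and the rows of Y."""
    sq_dists = (
        np.sum(X**2, axis=1)[:, None]
        + np.sum(Y**2, axis=1)[None, :]
        - 2.0 * X @ Y.T
    )
    # Guard against tiny negative distances caused by round-off.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * length_scale**2))


def mode_shrinkage(K, ridge):
    """Eigen-decompose the Gram matrix K and return, in descending eigenvalue
    order, the eigenvalues, eigenvectors, and per-mode factors
    lambda_i / (lambda_i + ridge)."""
    eigvals, eigvecs = np.linalg.eigh(K)            # ascending order
    eigvals = np.clip(eigvals[::-1], 0.0, None)     # descending, non-negative
    eigvecs = eigvecs[:, ::-1]
    return eigvals, eigvecs, eigvals / (eigvals + ridge)


# Hypothetical toy data: 200 Gaussian points in 3 dimensions.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
K = rbf_kernel(X, X)
ridge = 1e-2  # assumed ridge parameter

eigvals, eigvecs, shrink = mode_shrinkage(K, ridge)

# Target aligned with the top eigenvector: recovered almost exactly.
y_top = eigvecs[:, 0]
y_hat_top = K @ np.linalg.solve(K + ridge * np.eye(len(X)), y_top)

# Target aligned with the bottom eigenvector: strongly suppressed.
y_low = eigvecs[:, -1]
y_hat_low = K @ np.linalg.solve(K + ridge * np.eye(len(X)), y_low)

print("shrinkage of top mode:   ", shrink[0])
print("shrinkage of bottom mode:", shrink[-1])
```

This only shows the in-sample effect on one sampled dataset. The paper's contribution concerns bounding learnability on real-world data using spectra from idealized data measures, which this toy example does not attempt.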

