LG SP May 1, 2022

On the speed of uniform convergence in Mercer's theorem

arXiv:2205.00487v2
11 citations
h-index: 8
Originality: Synthesis-oriented
AI Analysis

This work provides theoretical convergence rates relevant to kernel methods in machine learning. The contribution is incremental but useful for practitioners working with kernel-based algorithms.

The paper estimates the speed of uniform convergence in Mercer's theorem for kernel representations, deriving bounds such as $\mathcal{O}\big((\sum_{i=N+1}^\infty λ_i)^{\frac{m}{m+n}}\big)$ for $2m$ times differentiable kernels, and applies these bounds to spectral characterizations of integral operators.

The classical Mercer's theorem claims that a continuous positive definite kernel $K({\mathbf x}, {\mathbf y})$ on a compact set can be represented as $\sum_{i=1}^\infty λ_iφ_i({\mathbf x})φ_i({\mathbf y})$ where $\{(λ_i,φ_i)\}$ are eigenvalue-eigenvector pairs of the corresponding integral operator. This infinite representation is known to converge uniformly to the kernel $K$. We estimate the speed of this convergence in terms of the decay rate of eigenvalues and demonstrate that for $2m$ times differentiable kernels the first $N$ terms of the series approximate $K$ as $\mathcal{O}\big((\sum_{i=N+1}^\infty λ_i)^{\frac{m}{m+n}}\big)$ or $\mathcal{O}\big((\sum_{i=N+1}^\infty λ^2_i)^{\frac{m}{2m+n}}\big)$. Finally, we demonstrate some applications of our results to a spectral characterization of integral operators with continuous roots and other powers.
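
To make the two quantities in the abstract concrete (the uniform error of the first $N$ Mercer terms and the eigenvalue tail $\sum_{i=N+1}^\infty λ_i$), here is a minimal numerical sketch. It is not taken from the paper. It assumes a hypothetical test kernel chosen only because its Mercer expansion is known in closed form: $K(x,y)=\sum_{k\ge 0} r^k\cos(k(x-y))$ on $[0,2π]$ with $0<r<1$, whose eigenvalues are $2π$ (for $k=0$) and $π r^k$ (twice, for each $k\ge 1$). The function names and the value `r = 0.5` are illustrative assumptions.

```python
import numpy as np

def poisson_kernel(t, r):
    # Closed form of sum_{k>=0} r^k cos(k t) for 0 <= r < 1, with t = x - y.
    return (1.0 - r * np.cos(t)) / (1.0 - 2.0 * r * np.cos(t) + r**2)

def truncated_kernel(t, r, M):
    # Partial Mercer sum keeping frequencies 0..M,
    # i.e. the first N = 2M + 1 eigenpairs (1, cos(kx), sin(kx) for k <= M).
    k = np.arange(M + 1)[:, None]
    return np.sum(r**k * np.cos(k * t[None, :]), axis=0)

def eigenvalue_tail(r, M):
    # Sum of discarded eigenvalues: each frequency k > M has eigenvalue
    # pi * r^k with multiplicity 2, so the tail is 2*pi * r^(M+1) / (1 - r).
    return 2.0 * np.pi * r**(M + 1) / (1.0 - r)

if __name__ == "__main__":
    r = 0.5                                  # illustrative choice, not from the paper
    t = np.linspace(-np.pi, np.pi, 2001)     # grid over x - y; includes t = 0
    for M in (2, 4, 8):
        sup_err = np.max(np.abs(poisson_kernel(t, r) - truncated_kernel(t, r, M)))
        tail = eigenvalue_tail(r, M)
        print(f"N={2 * M + 1:2d}  sup error={sup_err:.3e}  tail sum={tail:.3e}")
```

For this particular kernel the uniform error is attained at $x=y$ and equals the tail sum divided by $2π$, so it scales linearly with the tail. Since the kernel is infinitely differentiable, $m$ can be taken arbitrarily large with $n=1$, and the exponent $\frac{m}{m+n}$ in the stated bound approaches 1, which is consistent with this behaviour.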

