ML LGNov 5, 2025

A general technique for approximating high-dimensional empirical kernel matrices

Chiraag Kaushik, Justin Romberg, Vidya Muthukumar

arXiv:2511.03892v17.81 citationsh-index: 15

Originality Incremental advance

AI Analysis

This provides theoretical tools for analyzing kernel methods in high-dimensional settings, which is important for machine learning practitioners working with kernel-based algorithms.

The paper develops a general technique for approximating high-dimensional empirical kernel matrices using decoupling results and the non-commutative Khintchine inequality to provide upper and lower bounds. It applies this method to obtain tighter approximations for inner-product kernel matrices on high-dimensional data and shows a tighter lower bound on kernel regression bias with anisotropic Gaussian data.

We present simple, user-friendly bounds for the expected operator norm of a random kernel matrix under general conditions on the kernel function $k(\cdot,\cdot)$. Our approach uses decoupling results for U-statistics and the non-commutative Khintchine inequality to obtain upper and lower bounds depending only on scalar statistics of the kernel function and a ``correlation kernel'' matrix corresponding to $k(\cdot,\cdot)$. We then apply our method to provide new, tighter approximations for inner-product kernel matrices on general high-dimensional data, where the sample size and data dimension are polynomially related. Our method obtains simplified proofs of existing results that rely on the moment method and combinatorial arguments while also providing novel approximation results for the case of anisotropic Gaussian data. Finally, using similar techniques to our approximation result, we show a tighter lower bound on the bias of kernel regression with anisotropic Gaussian data.

View on arXiv PDF

Similar