Tight Kernel Query Complexity of Kernel Ridge Regression and Kernel $k$-means Clustering
This work addresses fundamental computational limits of kernel methods, which matter to machine learning researchers and practitioners, although its contribution is an incremental refinement of existing bounds.
The paper proves tight lower bounds on the number of kernel evaluations needed for kernel ridge regression (KRR) and kernel $k$-means clustering (KKMC): $\Omega(nd_{\mathrm{eff}}^\lambda/\varepsilon)$ for KRR and $\Omega(nk/\varepsilon)$ for KKMC. It also resolves a variant of an open question on sampling complexity and gives a KKMC algorithm that bypasses the lower bound when the input is a mixture of Gaussians.
We present tight lower bounds on the number of kernel evaluations required to approximately solve kernel ridge regression (KRR) and kernel $k$-means clustering (KKMC) on $n$ input points. For KRR, our bound for relative error approximation to the minimizer of the objective function is $\Omega(nd_{\mathrm{eff}}^\lambda/\varepsilon)$, where $d_{\mathrm{eff}}^\lambda$ is the effective statistical dimension; this is tight up to a $\log(d_{\mathrm{eff}}^\lambda/\varepsilon)$ factor. For KKMC, our bound for finding a $k$-clustering achieving a relative error approximation of the objective function is $\Omega(nk/\varepsilon)$, which is tight up to a $\log(k/\varepsilon)$ factor. Our KRR result resolves a variant of an open question of El Alaoui and Mahoney, which asks whether the effective statistical dimension is a lower bound on the sampling complexity. Furthermore, for the important practical case when the input is a mixture of Gaussians, we provide a KKMC algorithm which bypasses the above lower bound.
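To make the quantity $d_{\mathrm{eff}}^\lambda$ in the KRR bound concrete, the following is a minimal sketch that computes it for a small kernel matrix. It assumes the common definition $d_{\mathrm{eff}}^\lambda = \mathrm{tr}\big(K(K+\lambda I)^{-1}\big)$; the abstract does not state the normalization, and some authors use $n\lambda$ in place of $\lambda$. The function names (`gaussian_kernel`, `kernel_matrix`, `invert`, `effective_dimension`) and the `bandwidth` parameter are hypothetical and chosen only for illustration. The sketch builds the full matrix with $n^2$ kernel queries, which is the naive baseline and not the query-efficient setting studied here.

```python
import math


def gaussian_kernel(x, y, bandwidth=1.0):
    """k(x, y) = exp(-||x - y||^2 / (2 * bandwidth^2)); one call is one kernel query."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def kernel_matrix(points, kernel):
    """Full n x n kernel matrix, i.e. the naive n^2-query baseline."""
    return [[kernel(p, q) for q in points] for p in points]


def invert(A):
    """Gauss-Jordan inverse with partial pivoting (adequate for small matrices)."""
    n = len(A)
    # Augment A with the identity: [A | I].
    M = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
         for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        p = M[col][col]
        M[col] = [v / p for v in M[col]]
        for r in range(n):
            if r != col:
                f = M[r][col]
                M[r] = [a - f * b for a, b in zip(M[r], M[col])]
    return [row[n:] for row in M]


def effective_dimension(K, lam):
    """Assumed definition: tr(K (K + lam I)^{-1}) = n - lam * tr((K + lam I)^{-1})."""
    n = len(K)
    shifted = [[K[i][j] + (lam if i == j else 0.0) for j in range(n)]
               for i in range(n)]
    inv = invert(shifted)
    return n - lam * sum(inv[i][i] for i in range(n))


if __name__ == "__main__":
    points = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0)]
    K = kernel_matrix(points, gaussian_kernel)
    print(effective_dimension(K, lam=0.5))
```

Under this definition, each eigenvalue $\sigma_i$ of $K$ contributes $\sigma_i/(\sigma_i+\lambda)$ to the sum, so $d_{\mathrm{eff}}^\lambda$ never exceeds the rank of $K$ and shrinks as $\lambda$ grows.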