Kernel approximation on algebraic varieties
This addresses a fundamental mathematical problem with broad algorithmic applications in data science, enabling more efficient solutions for large-scale problems by exploiting algebraic structure in datasets.
The paper tackles the problem of low-rank kernel approximation when the kernel is restricted to an algebraic variety, such as in sparse or low-rank data, showing that the required rank depends on the variety's dimension rather than the larger ambient dimension, leading to significantly better approximations in both high-precision and high-dimensional regimes.
Low-rank approximation of kernels is a fundamental mathematical problem with widespread algorithmic applications. Often the kernel is restricted to an algebraic variety, e.g., in problems involving sparse or low-rank data. We show that significantly better approximations are obtainable in this setting: the rank required to achieve a given error depends on the variety's dimension rather than the ambient dimension, which is typically much larger. This is true in both high-precision and high-dimensional regimes. Our results are presented for smooth isotropic kernels, the predominant class of kernels used in applications. Our main technical insight is to approximate smooth kernels by polynomial kernels, and leverage two key properties of polynomial kernels that hold when they are restricted to a variety. First, their ranks decrease exponentially in the variety's co-dimension. Second, their maximum values are governed by their values over a small set of points. Together, our results provide a general approach for exploiting (approximate) "algebraic structure" in datasets in order to efficiently solve large-scale data science problems.