LG · PR · CO · ML · Nov 5, 2024

A spectral mixture representation of isotropic kernels to generalize random Fourier features

arXiv:2411.02770v3 · 3 citations · h-index: 2
Originality: Incremental advance
AI Analysis

This work addresses a practical bottleneck in kernel methods by extending random Fourier features (RFF) to a broader class of kernels. The advance is incremental, since it builds on existing RFF theory.

The paper extends Random Fourier Features (RFF) beyond the Gaussian kernel by showing that the spectral distribution of any positive definite isotropic kernel can be decomposed as a scale mixture of α-stable random vectors. This yields a simple spectral sampling formula for kernels such as the exponential power and Matérn kernels, and so broadens the use of RFF in kernel-based methods such as support vector machines and Gaussian processes.
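
For orientation, the standard identities behind this summary can be sketched as follows. The notation ($\omega$, $b_j$, $D$, $R$, $Z$, $\tau$) is chosen here for illustration and is not taken from the paper. For a shift-invariant positive definite kernel normalized so that $k(0) = 1$, Bochner's theorem identifies $k$ with the characteristic function of its spectral distribution, and RFF replaces the expectation by a Monte Carlo average over $D$ sampled frequencies:

$$
k(x - y) = \mathbb{E}_{\omega}\left[\cos\left(\omega^{\top}(x - y)\right)\right] \approx \frac{2}{D} \sum_{j=1}^{D} \cos\left(\omega_j^{\top} x + b_j\right) \cos\left(\omega_j^{\top} y + b_j\right), \qquad b_j \sim \mathcal{U}[0, 2\pi].
$$

In the Gaussian special case ($\alpha = 2$) of a scale mixture, assume $\omega = \sqrt{R}\, Z$ with $Z \sim \mathcal{N}(0, I_d)$ and a mixing variable $R \geq 0$ independent of $Z$. Conditioning on $R$ then gives

$$
k(\tau) = \mathbb{E}_{R}\left[\exp\left(-\tfrac{1}{2} R \, \|\tau\|^{2}\right)\right],
$$

so sampling from the spectral distribution reduces to drawing $R$ from the mixing distribution and then drawing a Gaussian vector.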

Rahimi and Recht (2007) introduced the idea of decomposing positive definite shift-invariant kernels by randomly sampling from their spectral distribution. This famous technique, known as Random Fourier Features (RFF), is in principle applicable to any such kernel whose spectral distribution can be identified and simulated. In practice, however, it is usually applied to the Gaussian kernel because of its simplicity, since its spectral distribution is also Gaussian. Clearly, simple spectral sampling formulas would be desirable for broader classes of kernels. In this paper, we show that the spectral distribution of positive definite isotropic kernels in $\mathbb{R}^{d}$ for all $d \geq 1$ can be decomposed as a scale mixture of $\alpha$-stable random vectors, and we identify the mixing distribution as a function of the kernel. This constructive decomposition provides a simple and ready-to-use spectral sampling formula for many multivariate positive definite shift-invariant kernels, including exponential power kernels, generalized Matérn kernels, generalized Cauchy kernels, as well as newly introduced kernels such as the Beta, Kummer, and Tricomi kernels. In particular, we retrieve the fact that the spectral distributions of these kernels are scale mixtures of the multivariate Gaussian distribution, along with an explicit mixing distribution formula. This result has broad applications for support vector machines, kernel ridge regression, Gaussian processes, and other kernel-based machine learning techniques for which the random Fourier features technique is applicable.
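
To make the scale-mixture idea concrete, here is a minimal NumPy sketch for one well-known special case. It does not implement the paper's general α-stable formula. It relies on the standard fact that the spectral distribution of a Matérn kernel with smoothness `nu` is a multivariate Student-t with `2 * nu` degrees of freedom, which is a Gaussian scale mixture. The function names (`sample_matern_frequencies`, `rff_map`, `matern12_kernel`) and all parameter values are illustrative assumptions, not taken from the paper.

```python
import numpy as np


def sample_matern_frequencies(n_features, dim, nu, lengthscale, rng):
    """Draw frequencies from the Matern spectral distribution.

    Assumption: uses the Gaussian scale-mixture form of the multivariate
    Student-t with 2 * nu degrees of freedom (a Gaussian vector divided
    by sqrt(chi2 / df)), rescaled by 1 / lengthscale.
    """
    df = 2.0 * nu
    z = rng.standard_normal((n_features, dim))       # Gaussian part
    u = rng.chisquare(df, size=(n_features, 1))      # mixing variable
    return z / (lengthscale * np.sqrt(u / df))


def rff_map(X, omega, phases):
    """Random Fourier feature map using cosines with random phases."""
    n_features = omega.shape[0]
    return np.sqrt(2.0 / n_features) * np.cos(X @ omega.T + phases)


def matern12_kernel(X, Y, lengthscale):
    """Exact Matern kernel for nu = 1/2: exp(-||x - y|| / lengthscale)."""
    dists = np.linalg.norm(X[:, None, :] - Y[None, :, :], axis=-1)
    return np.exp(-dists / lengthscale)


rng = np.random.default_rng(0)
dim, n_features = 3, 20000
X = rng.standard_normal((5, dim))

omega = sample_matern_frequencies(n_features, dim, nu=0.5, lengthscale=1.0, rng=rng)
phases = rng.uniform(0.0, 2.0 * np.pi, size=n_features)

Z = rff_map(X, omega, phases)
approx = Z @ Z.T                                  # RFF estimate of the Gram matrix
exact = matern12_kernel(X, X, lengthscale=1.0)    # closed-form reference
max_abs_error = np.max(np.abs(approx - exact))
print(f"max abs error: {max_abs_error:.4f}")
```

With `nu=0.5` the frequencies follow a multivariate Cauchy distribution and the target kernel is the exponential kernel, which makes the comparison against a closed form straightforward. The error shrinks roughly like the inverse square root of `n_features`, as with any Monte Carlo estimate.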

