Feature maps for the Laplacian kernel and its generalizations
This work addresses a specific bottleneck in kernel methods for machine learning practitioners, offering incremental improvements in approximating non-separable kernels.
The authors tackled the challenge of approximating the Laplacian kernel and its generalizations, which are not separable and thus difficult to handle with standard random Fourier features methods, by developing efficiently implementable random feature maps with weakly coupled heavy-tailed randomness, demonstrating their efficacy through numerical experiments on real datasets.
Recent applications of kernel methods in machine learning have seen a renewed interest in the Laplacian kernel, due to its stability to the bandwidth hyperparameter in comparison to the Gaussian kernel, as well as its expressivity being equivalent to that of the neural tangent kernel of deep fully connected networks. However, unlike the Gaussian kernel, the Laplacian kernel is not separable. This poses challenges for techniques to approximate it, especially via the random Fourier features (RFF) methodology and its variants. In this work, we provide random features for the Laplacian kernel and its two generalizations: Matérn kernel and the Exponential power kernel. We provide efficiently implementable schemes to sample weight matrices so that random features approximate these kernels. These weight matrices have a weakly coupled heavy-tailed randomness. Via numerical experiments on real datasets we demonstrate the efficacy of these random feature maps.