LG
Sep 8, 2017

Gaussian Quadrature for Kernel Features

Tri Dao, Christopher De Sa, Christopher Ré

arXiv:1709.02605v3
54 citations

Originality: Incremental advance

AI Analysis

This work addresses the computational bottleneck in scaling kernel methods, offering practitioners faster feature generation with competitive accuracy. The advance is incremental in that it builds on existing kernel techniques.

The paper tackles the inefficiency of random Fourier features in kernel methods by proposing deterministic feature maps built with Gaussian quadrature. These maps achieve error $ε$ with $O(e^{e^γ} + ε^{-1/γ})$ samples and show accuracy comparable to state-of-the-art methods on datasets such as MNIST and TIMIT.

Kernel methods have recently attracted resurgent interest, showing performance competitive with deep neural networks in tasks such as speech recognition. The random Fourier features map is a technique commonly used to scale up kernel machines, but employing the randomized feature map means that $O(ε^{-2})$ samples are required to achieve an approximation error of at most $ε$. We investigate some alternative schemes for constructing feature maps that are deterministic, rather than random, by approximating the kernel in the frequency domain using Gaussian quadrature. We show that deterministic feature maps can be constructed, for any $γ > 0$, to achieve error $ε$ with $O(e^{e^γ} + ε^{-1/γ})$ samples as $ε$ goes to 0. Our method works particularly well with sparse ANOVA kernels, which are inspired by the convolutional layer of CNNs. We validate our methods on datasets in different domains, such as MNIST and TIMIT, showing that deterministic features are faster to generate and achieve accuracy comparable to the state-of-the-art kernel methods based on random Fourier features.
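To make the idea of a deterministic feature map concrete, here is a minimal sketch for the Gaussian kernel $\exp(-\|x-y\|^2 / (2σ^2))$. It replaces the random frequencies of random Fourier features with Gauss-Hermite quadrature nodes on a dense tensor-product grid. This is an illustration under stated assumptions, not the authors' implementation: the function name `gauss_hermite_features` and the parameters `n_nodes` and `sigma` are hypothetical, and the dense grid has `n_nodes ** d` points, so it is only practical for small input dimension `d`.

```python
import itertools

import numpy as np


def gauss_hermite_features(X, n_nodes=10, sigma=1.0):
    """Deterministic features z(x) with z(x) . z(y) ~= exp(-||x-y||^2 / (2 sigma^2)).

    Illustrative sketch: dense tensor-product Gauss-Hermite grid, small d only.
    """
    X = np.atleast_2d(np.asarray(X, dtype=float))
    d = X.shape[1]

    # hermgauss returns nodes t_i and weights w_i for  integral e^{-t^2} f(t) dt.
    nodes, weights = np.polynomial.hermite.hermgauss(n_nodes)

    # Change of variables so the rule integrates against N(0, 1 / sigma^2):
    # E[f(omega)] ~= sum_i (w_i / sqrt(pi)) * f(sqrt(2) * t_i / sigma).
    omega_1d = np.sqrt(2.0) * nodes / sigma
    a_1d = weights / np.sqrt(np.pi)

    # Tensor-product grid of frequencies and the matching product weights.
    omegas = np.array(list(itertools.product(omega_1d, repeat=d)))        # (n^d, d)
    a = np.prod(np.array(list(itertools.product(a_1d, repeat=d))), axis=1)  # (n^d,)

    proj = X @ omegas.T                  # (m, n^d) inner products omega_j . x
    scale = np.sqrt(a)                   # Gauss-Hermite weights are positive
    # cos/sin pairs give sum_j a_j * cos(omega_j . (x - y)) as the inner product.
    return np.hstack([scale * np.cos(proj), scale * np.sin(proj)])


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.uniform(0.0, 1.0, size=(5, 2))
    Z = gauss_hermite_features(X, n_nodes=10, sigma=1.0)

    # Exact Gaussian kernel matrix for comparison.
    sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    K_exact = np.exp(-sq_dists / 2.0)
    print("max abs error:", np.max(np.abs(Z @ Z.T - K_exact)))
```

The inner product of two feature vectors equals a weighted sum of $\cos(ω_j^\top (x - y))$ over the quadrature nodes, which is the same form as the random Fourier features estimate, only with fixed nodes and weights instead of sampled frequencies. The exponential growth of the dense grid with dimension is the reason cheaper constructions are needed for high-dimensional inputs.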
