LG | ML | Feb 11, 2020

Generalization Guarantees for Sparse Kernel Approximation with Entropic Optimal Features

arXiv:2002.04195v1 | 5.08 citations

Originality: Highly original

AI Analysis

This work addresses the computational bottleneck in kernel methods for machine learning practitioners, offering a sparse and efficient approximation with proven generalization guarantees.

The paper tackles the high computational cost of kernel methods by developing a novel optimal design that maximizes entropy among kernel features. The resulting entropic optimal features (EOF) improve data representation because the selected features are dissimilar to one another. The generalization bound shows that, with only O(N^(1/4)) features (disregarding logarithmic factors), the method achieves the optimal statistical accuracy of O(1/√N). Experiments on benchmark datasets verify EOF's superiority over state-of-the-art kernel approximation methods.

Despite their success, kernel methods suffer from a massive computational cost in practice. In this paper, in lieu of the commonly used kernel expansion with respect to $N$ inputs, we develop a novel optimal design maximizing the entropy among kernel features. This procedure results in a kernel expansion with respect to entropic optimal features (EOF), improving the data representation dramatically due to feature dissimilarity. Under mild technical assumptions, our generalization bound shows that with only $O(N^{\frac{1}{4}})$ features (disregarding logarithmic factors), we can achieve the optimal statistical accuracy (i.e., $O(1/\sqrt{N})$). The salient feature of our design is its sparsity, which significantly reduces the time and space cost. Our numerical experiments on benchmark datasets verify the superiority of EOF over the state-of-the-art in kernel approximation.
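To give a rough sense of scale for the rates quoted in the abstract, the following sketch evaluates them for a few sample sizes. It is only illustrative arithmetic, not the paper's algorithm: the function name `feature_budget` is hypothetical, the feature count is taken as exactly the ceiling of $N^{1/4}$ with constants and logarithmic factors ignored, and the storage comparison simply contrasts an $N \times N$ Gram matrix with an $N \times M$ feature matrix.

```python
import math


def feature_budget(n_samples: int) -> dict:
    """Illustrative orders of magnitude only (constants and log factors ignored)."""
    # Ceiling of the fourth root, computed with integer arithmetic
    # to avoid floating-point rounding surprises.
    m = math.isqrt(math.isqrt(n_samples))
    if m ** 4 < n_samples:
        m += 1
    return {
        "n_samples": n_samples,
        # Assumed feature count: M = ceil(N^(1/4)).
        "n_features": m,
        # Statistical accuracy rate 1/sqrt(N), with the constant set to 1.
        "accuracy_rate": 1.0 / math.sqrt(n_samples),
        # Entries in a full N x N Gram matrix.
        "gram_entries": n_samples * n_samples,
        # Entries in an N x M feature matrix.
        "feature_entries": n_samples * m,
    }


for n in (10_000, 1_000_000):
    print(feature_budget(n))
```

Under these assumptions, $N = 10{,}000$ samples correspond to about 10 features, so the feature matrix holds $10^5$ entries where the full Gram matrix would hold $10^8$.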
