LG ML Feb 11, 2020

Generalization Guarantees for Sparse Kernel Approximation with Entropic Optimal Features

arXiv:2002.04195v1, 8 citations
Originality: Highly original
AI Analysis

This work addresses the computational bottleneck in kernel methods for machine learning practitioners, offering a sparse and efficient approximation with proven generalization guarantees.

The paper tackles the high computational cost of kernel methods. Instead of the usual kernel expansion over all N inputs, it develops an optimal design that maximizes the entropy among kernel features, yielding entropic optimal features (EOF) whose dissimilarity improves the data representation. Under mild technical assumptions and disregarding logarithmic factors, the generalization bound shows that O(N^(1/4)) features suffice to achieve the optimal statistical accuracy of O(1/√N). The resulting expansion is sparse, which reduces time and space cost, and experiments on benchmark datasets show EOF outperforming state-of-the-art kernel approximation methods.

Despite their success, kernel methods suffer from a massive computational cost in practice. In this paper, in lieu of the commonly used kernel expansion with respect to $N$ inputs, we develop a novel optimal design maximizing the entropy among kernel features. This procedure results in a kernel expansion with respect to entropic optimal features (EOF), improving the data representation dramatically due to feature dissimilarity. Under mild technical assumptions, our generalization bound shows that with only $O(N^{\frac{1}{4}})$ features (disregarding logarithmic factors), we can achieve the optimal statistical accuracy (i.e., $O(1/\sqrt{N})$). The salient feature of our design is its sparsity, which significantly reduces the time and space cost. Our numerical experiments on benchmark datasets verify the superiority of EOF over the state-of-the-art in kernel approximation.
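To give a sense of scale for the rates quoted above, the following is a minimal arithmetic sketch, not the paper's algorithm. It assumes all constants equal 1 and ignores logarithmic factors, so the numbers are only indicative. The helper names (feature_budget, target_rate, storage_entries) are hypothetical and introduced here for illustration.

```python
import math


def feature_budget(n_samples: int) -> int:
    """Smallest integer M with M**4 >= N, i.e. ceil(N^(1/4)).

    Assumption: constant factor 1 and no log factors; the paper's bound
    only states the order O(N^(1/4)).
    """
    m = math.isqrt(math.isqrt(n_samples))  # floor(N^(1/4)) in exact integer arithmetic
    if m ** 4 < n_samples:
        m += 1
    return m


def target_rate(n_samples: int) -> float:
    """Order of the optimal statistical accuracy, 1/sqrt(N), with constant 1."""
    return 1.0 / math.sqrt(n_samples)


def storage_entries(n_samples: int) -> tuple[int, int]:
    """Entry counts of a full N x N kernel matrix vs. an N x M feature matrix."""
    m = feature_budget(n_samples)
    return n_samples * n_samples, n_samples * m


if __name__ == "__main__":
    for n in (10_000, 1_000_000, 100_000_000):
        full, reduced = storage_entries(n)
        print(
            f"N={n:>11,}  M={feature_budget(n):>4}  "
            f"rate~{target_rate(n):.1e}  "
            f"full kernel entries={full:.1e}  feature entries={reduced:.1e}"
        )
```

Under these assumptions, N = 10,000 samples would call for about 10 features, so the feature matrix holds 10^5 entries where the full kernel matrix holds 10^8.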

