LG, Sep 2, 2016

Doubly stochastic large scale kernel learning with the empirical kernel map

arXiv:1609.00585v2
AI Analysis

The paper addresses the computational bottleneck faced by researchers and practitioners who apply kernel methods to big data, and it offers a scalable alternative to neural networks. It appears incremental, however, because it builds on existing stochastic optimization techniques.

The paper tackles the scalability problem of kernel methods by introducing a doubly stochastic optimization approach that uses the empirical kernel map, enabling efficient large-scale kernel learning without discarding data or approximating the kernel map. It demonstrates empirical effectiveness on large datasets, though no specific performance numbers are provided.

With the rise of big data sets, the popularity of kernel methods declined and neural networks took over again. The main problem with kernel methods is that the kernel matrix grows quadratically with the number of data points. Most attempts to scale up kernel methods solve this problem by discarding data points or basis functions of some approximation of the kernel map. Here we present a simple yet effective alternative for scaling up kernel methods that takes into account the entire data set via doubly stochastic optimization of the empirical kernel map. The algorithm is straightforward to implement, in particular in parallel execution settings; it leverages the full power and versatility of classical kernel functions without the need to explicitly formulate a kernel map approximation. We provide empirical evidence that the algorithm works on large data sets.
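To make the idea in the abstract concrete, here is a minimal sketch of one way "doubly stochastic optimization of the empirical kernel map" could look. It is not taken from the paper. The RBF kernel, the hinge loss, the L2 penalty, and the names `fit_doubly_stochastic`, `n_grad`, and `n_expand` are all illustrative assumptions. Each step samples one random subset of points to estimate the gradient and a second random subset of points to serve as kernel expansion coordinates, so only a small `n_grad` by `n_expand` kernel block is computed during training and the full N by N kernel matrix is never formed.

```python
import numpy as np


def rbf_kernel(A, B, gamma=0.5):
    """Gaussian (RBF) kernel block between the rows of A and the rows of B."""
    sq = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))


def fit_doubly_stochastic(X, y, n_steps=300, n_grad=32, n_expand=64,
                          lr=0.5, reg=1e-4, gamma=0.5, seed=0):
    """Hypothetical sketch: SGD on hinge loss with two sources of sampling.

    X: (n, d) inputs, y: (n,) labels in {-1, +1}.
    Returns one weight per training point (empirical kernel map coordinates).
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    alpha = np.zeros(n)
    for _ in range(n_steps):
        # First source of randomness: points used to estimate the gradient.
        I = rng.choice(n, size=min(n_grad, n), replace=False)
        # Second source: points used as kernel expansion coordinates.
        J = rng.choice(n, size=min(n_expand, n), replace=False)
        K_IJ = rbf_kernel(X[I], X[J], gamma)      # small block only
        scores = K_IJ @ alpha[J]                  # partial expansion (assumption: no rescaling)
        active = y[I] * scores < 1.0              # hinge loss is active here
        grad = -(K_IJ[active].T @ y[I][active]) / len(I) + reg * alpha[J]
        alpha[J] -= lr * grad                     # update only the sampled coordinates
    return alpha


def predict(X_train, alpha, X_new, gamma=0.5):
    """Predict labels using the full kernel expansion over the training set."""
    return np.sign(rbf_kernel(X_new, X_train, gamma) @ alpha)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal(-2, 0.5, (100, 2)), rng.normal(2, 0.5, (100, 2))])
    y = np.concatenate([-np.ones(100), np.ones(100)])
    alpha = fit_doubly_stochastic(X, y)
    print("training accuracy:", (predict(X, alpha, X) == y).mean())
```

In this sketch the per-step cost depends on `n_grad * n_expand` rather than on the square of the data set size, which is the scaling argument the abstract makes. Because each step touches only the coordinates in its own sample, independent workers could in principle process different samples in parallel, although this sketch runs serially.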

