MLLGFAOCSTOct 21, 2017

Optimal Rates for Learning with Nyström Stochastic Gradient Methods

arXiv:1710.07797v14 citations
Originality Incremental advance
AI Analysis

This work addresses computational efficiency in large-scale kernel methods for machine learning practitioners, though it appears incremental as it builds on existing Nyström and stochastic gradient approaches.

The paper tackles nonparametric regression by combining stochastic gradient methods with Nyström subsampling, achieving optimal learning rates comparable to state-of-the-art methods while reducing computational complexity.

In the setting of nonparametric regression, we propose and study a combination of stochastic gradient methods with Nyström subsampling, allowing multiple passes over the data and mini-batches. Generalization error bounds for the studied algorithm are provided. Particularly, optimal learning rates are derived considering different possible choices of the step-size, the mini-batch size, the number of iterations/passes, and the subsampling level. In comparison with state-of-the-art algorithms such as the classic stochastic gradient methods and kernel ridge regression with Nyström, the studied algorithm has advantages on the computational complexity, while achieving the same optimal learning rates. Moreover, our results indicate that using mini-batches can reduce the total computational cost while achieving the same optimal statistical results.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes