ML · LG · Jul 17, 2018

Learning with SGD and Random Features

arXiv:1807.06343v3 · 84 citations
Originality: Incremental advance
AI Analysis

This work provides theoretical insights for efficient large-scale learning algorithms, but it is incremental as it builds on existing sketching and stochastic gradient techniques.

The paper tackles the problem of nonparametric statistical learning by analyzing an estimator based on stochastic gradient descent with mini-batches and random features, deriving optimal finite sample bounds under standard assumptions.

Sketching and stochastic gradient methods are arguably the most common techniques to derive efficient large-scale learning algorithms. In this paper, we investigate their application in the context of nonparametric statistical learning. More precisely, we study the estimator defined by stochastic gradient with mini-batches and random features. The latter can be seen as a form of nonlinear sketching and used to define approximate kernel methods. The considered estimator is not explicitly penalized/constrained and regularization is implicit. Indeed, our study highlights how different parameters, such as the number of features, iterations, step-size and mini-batch size, control the learning properties of the solutions. We do this by deriving optimal finite sample bounds, under standard assumptions. The obtained results are corroborated and illustrated by numerical experiments.
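
To make the estimator in the abstract concrete, here is a minimal sketch of mini-batch SGD on random features for least squares. It is not the authors' code. The Gaussian kernel, the random Fourier feature map, the toy data, the sampling of mini-batches with replacement, and every name and value (`make_random_features`, `sgd_random_features`, `gamma`, `step_size`, and so on) are assumptions chosen for illustration. It uses only the standard library to stay dependency-free.

```python
import math
import random


def make_random_features(d, M, gamma, rng):
    """Return phi: R^d -> R^M, random Fourier features whose inner products
    approximate the Gaussian kernel exp(-gamma * ||x - x'||^2) (assumed kernel)."""
    std = math.sqrt(2.0 * gamma)  # frequencies are drawn from N(0, 2 * gamma * I)
    W = [[rng.gauss(0.0, std) for _ in range(d)] for _ in range(M)]
    b = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(M)]
    scale = math.sqrt(2.0 / M)

    def phi(x):
        return [scale * math.cos(sum(wk * xk for wk, xk in zip(w, x)) + bj)
                for w, bj in zip(W, b)]

    return phi


def predict(w, f):
    """Linear prediction in feature space."""
    return sum(wj * fj for wj, fj in zip(w, f))


def sgd_random_features(feats, y, step_size, batch_size, n_iters, rng):
    """Mini-batch SGD on the squared loss with no penalty term.
    Regularization comes only from M, step_size, batch_size and n_iters."""
    n, M = len(feats), len(feats[0])
    w = [0.0] * M
    for _ in range(n_iters):
        grad = [0.0] * M
        for _ in range(batch_size):
            i = rng.randrange(n)  # assumption: sample with replacement
            r = predict(w, feats[i]) - y[i]
            for j in range(M):
                grad[j] += r * feats[i][j]
        for j in range(M):
            w[j] -= step_size * grad[j] / batch_size
    return w


def mse(w, feats, y):
    """Mean squared error of the linear predictor w on (feats, y)."""
    return sum((predict(w, f) - t) ** 2 for f, t in zip(feats, y)) / len(y)


# Toy 1-D regression problem (hypothetical data, fixed seed for repeatability).
rng = random.Random(0)
n, M = 200, 100
X = [[rng.uniform(-1.0, 1.0)] for _ in range(n)]
y = [math.sin(3.0 * x[0]) + rng.gauss(0.0, 0.1) for x in X]

phi = make_random_features(d=1, M=M, gamma=5.0, rng=rng)
feats = [phi(x) for x in X]

mse_before = mse([0.0] * M, feats, y)
w = sgd_random_features(feats, y, step_size=0.5, batch_size=10,
                        n_iters=500, rng=rng)
mse_after = mse(w, feats, y)
print(f"training MSE: {mse_before:.3f} (w = 0) -> {mse_after:.3f} (after SGD)")
```

The four knobs the abstract names appear here as `M`, `n_iters`, `step_size` and `batch_size`. Since the update has no explicit penalty, changing any of them is the only way this sketch trades off fit against stability.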

