LG · ML · Jun 28, 2020

A Multilevel Approach to Training

arXiv:2006.15602v1 · 4 citations
Originality: Incremental advance
AI Analysis

This paper addresses the efficiency of training large-scale machine learning models, though the contribution appears incremental because it builds on existing multilevel techniques from the numerical solution of partial differential equations.

The authors tackle the problem of training machine learning models efficiently by proposing a multilevel training method in which surrogate models built from fewer samples yield gradient estimates with reduced variance. On logistic regression applications, they report improved convergence compared to subsampled Newton and variance reduction methods.

We propose a novel training method based on nonlinear multilevel minimization techniques, commonly used for solving discretized large-scale partial differential equations. Our multilevel training method constructs a multilevel hierarchy by reducing the number of samples. The training of the original model is then enhanced by internally training surrogate models constructed with fewer samples. We construct the surrogate models using a first-order consistency approach. This gives rise to surrogate models whose gradients are stochastic estimators of the full gradient, but with reduced variance compared to standard stochastic gradient estimators. We illustrate the convergence behavior of the proposed multilevel method on machine learning applications based on logistic regression. A comparison with subsampled Newton and variance reduction methods demonstrates the efficiency of our multilevel method.
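
The first-order consistency idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the standard multilevel construction, in which a coarse objective f_S (the loss on a subset S of the samples) is corrected by a linear term so that its gradient matches the full gradient at the current iterate w0: h(w) = f_S(w) + <grad f(w0) - grad f_S(w0), w - w0>. The function names, the synthetic data, and the choice of every fourth sample as the coarse level are all hypothetical.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def logistic_grad(w, X, y, idx):
    """Gradient of the mean logistic loss over the samples in idx (labels in {0, 1})."""
    g = [0.0] * len(w)
    for i in idx:
        z = sum(wj * xj for wj, xj in zip(w, X[i]))
        r = sigmoid(z) - y[i]
        for j, xj in enumerate(X[i]):
            g[j] += r * xj
    return [gj / len(idx) for gj in g]


def surrogate_grad(w, w0, X, y, coarse_idx, full_grad_w0):
    """Gradient of the assumed surrogate
    h(w) = f_S(w) + <grad f(w0) - grad f_S(w0), w - w0>,
    i.e. grad f_S(w) - grad f_S(w0) + grad f(w0)."""
    g_c = logistic_grad(w, X, y, coarse_idx)
    g_c0 = logistic_grad(w0, X, y, coarse_idx)
    return [a - b + c for a, b, c in zip(g_c, g_c0, full_grad_w0)]


def make_data(n=200, seed=0):
    # Hypothetical synthetic data set with three features.
    rng = random.Random(seed)
    w_true = [1.0, -2.0, 0.5]
    X = [[rng.gauss(0.0, 1.0) for _ in w_true] for _ in range(n)]
    y = [
        1.0 if rng.random() < sigmoid(sum(a * b for a, b in zip(w_true, x))) else 0.0
        for x in X
    ]
    return X, y


X, y = make_data()
all_idx = list(range(len(X)))
coarse_idx = all_idx[::4]  # assumed coarse level: every fourth sample

w0 = [0.1, -0.1, 0.2]
full_g0 = logistic_grad(w0, X, y, all_idx)

# First-order consistency: at w = w0 the surrogate gradient equals the full gradient.
g_h0 = surrogate_grad(w0, w0, X, y, coarse_idx, full_g0)
consistency_gap = max(abs(a - b) for a, b in zip(g_h0, full_g0))
print("consistency gap at w0:", consistency_gap)
```

Away from w0, each surrogate gradient evaluation touches only the coarse samples, while the correction term anchors it to the full gradient computed once at w0. Under this assumed construction the gradient has the same form as the control-variate estimators used in SVRG-style variance reduction, which is consistent with the abstract's description of a stochastic estimator with reduced variance.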

