LG · OC · ML · Dec 31, 2019

A Dynamic Sampling Adaptive-SGD Method for Machine Learning

arXiv:1912.13357v2 · 5 citations
Originality: Incremental advance
AI Analysis

This work addresses the need for automated hyperparameter tuning in machine learning optimization, offering an incremental improvement over existing methods like SGD and ADAM.

The authors tackled the problem of tuning learning rates in stochastic optimization by proposing a method that adaptively controls batch size and step size, eliminating manual tuning and achieving favorable performance compared to fine-tuned SGD and ADAM in training logistic regression and DNNs.

We propose a stochastic optimization method for minimizing loss functions, expressed as an expected value, that adaptively controls the batch size used in the computation of gradient approximations and the step size used to move along such directions, eliminating the need for the user to tune the learning rate. The proposed method exploits local curvature information, uses an acute-angle test to ensure that search directions are descent directions with high probability, and can be used as a method with a global linear rate of convergence on self-concordant functions with high probability. Numerical experiments show that this method is able to choose the best learning rates and compares favorably to fine-tuned SGD for training logistic regression and DNNs. We also propose an adaptive version of ADAM that eliminates the need to tune the base learning rate and compares favorably to fine-tuned ADAM on training DNNs. In our DNN experiments, we rarely encountered negative curvature at the current point along the step direction.
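The idea in the abstract, growing the batch when the sampled gradient is too noisy and setting the step from curvature along the search direction, can be illustrated with a minimal NumPy sketch for logistic regression. This is not the authors' algorithm. The variance check below is an inner-product-style test used as a stand-in for the paper's acute-angle test, the step is the minimizer of a one-dimensional quadratic model rather than the paper's step rule, and the names `adaptive_sgd`, `theta`, `batch0`, and `max_step` are hypothetical, as is the step cap.

```python
import numpy as np

def per_sample_grads(w, X, y):
    """Per-sample logistic-loss gradients (labels in {0, 1}) and predicted probabilities."""
    z = np.clip(X @ w, -30.0, 30.0)          # clip logits to avoid overflow in exp
    p = 1.0 / (1.0 + np.exp(-z))
    return (p - y)[:, None] * X, p

def adaptive_sgd(X, y, iters=50, batch0=8, theta=0.9, max_step=10.0, seed=0):
    """Illustrative sketch only: adaptive batch size and curvature-based step size."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    batch = batch0
    history = []                              # (batch size used, step size taken)
    for _ in range(iters):
        idx = rng.choice(n, size=min(batch, n), replace=False)
        Xb, yb = X[idx], y[idx]
        G, p = per_sample_grads(w, Xb, yb)
        g = G.mean(axis=0)                    # mini-batch gradient
        gnorm2 = float(g @ g)
        if gnorm2 < 1e-16:
            break
        b = len(idx)

        # Assumed stand-in for the acute-angle test: if the sample variance of
        # g_i . g is large relative to ||g||^4, -g may not be a descent
        # direction, so request a larger batch for the next iteration.
        if 1 < b < n:
            var = float(np.var(G @ g, ddof=1))
            if var / b > theta**2 * gnorm2**2:
                batch = min(n, int(np.ceil(var / (theta**2 * gnorm2**2))))

        # Curvature along g using the mini-batch Hessian of the logistic loss:
        # g^T H g = mean_i p_i (1 - p_i) (x_i . g)^2.
        curv = float(np.mean(p * (1.0 - p) * (Xb @ g) ** 2))

        # Step that minimizes the 1-D quadratic model along -g, capped as a safeguard.
        step = min(gnorm2 / curv, max_step) if curv > 1e-12 else 1.0
        w -= step * g
        history.append((b, step))
    return w, history
```

A call such as `w, history = adaptive_sgd(X, y)` on a feature matrix `X` of shape `(n, d)` and a 0/1 label vector `y` returns the final weights together with the batch size and step size used at each iteration, so the growth of the batch can be inspected directly. No learning rate is passed in; the only tunables are the test threshold and the safeguards.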
