ML LG OCJul 9, 2019

Ordered SGD: A New Stochastic Optimization Framework for Empirical Risk Minimization

arXiv:1907.04371v522.378 citationsHas Code

Originality Incremental advance

AI Analysis

This work addresses optimization efficiency for machine learning practitioners by offering an incremental improvement over existing stochastic gradient descent methods.

The authors tackled empirical risk minimization in machine learning by proposing a new stochastic optimization framework that biases gradient estimators toward observations with higher current losses, resulting in consistent improvements in test errors across SVM, logistic regression, and deep learning models compared to standard mini-batch SGD.

We propose a new stochastic optimization framework for empirical risk minimization problems such as those that arise in machine learning. The traditional approaches, such as (mini-batch) stochastic gradient descent (SGD), utilize an unbiased gradient estimator of the empirical average loss. In contrast, we develop a computationally efficient method to construct a gradient estimator that is purposely biased toward those observations with higher current losses. On the theory side, we show that the proposed method minimizes a new ordered modification of the empirical average loss, and is guaranteed to converge at a sublinear rate to a global optimum for convex loss and to a critical point for weakly convex (non-convex) loss. Furthermore, we prove a new generalization bound for the proposed algorithm. On the empirical side, the numerical experiments show that our proposed method consistently improves the test errors compared with the standard mini-batch SGD in various models including SVM, logistic regression, and deep learning problems.

View on arXiv PDF Code

Similar