LG, OC, ML | Aug 13, 2020

Variance Regularization for Accelerating Stochastic Optimization

arXiv:2008.05969v1
Originality: Incremental advance
AI Analysis

This work addresses a general issue in stochastic optimization for machine learning practitioners, offering an incremental improvement to existing first-order methods.

The paper tackles the problem of random error accumulation in stochastic gradient-based optimization by proposing variance regularization of learning rates using mini-batch statistics, which accelerates convergence and stabilizes the process, as demonstrated empirically.

While most gradient-based optimization methods today focus on exploring high-dimensional geometric features, the random error that accumulates in the stochastic implementation of any algorithm has received little attention. In this work, we propose a universal principle that reduces this error accumulation by exploiting the statistical information hidden in mini-batch gradients. This is achieved by regularizing the learning rate according to mini-batch variances. Because this perspective is complementary to existing ones, the regularization can further improve stochastic implementations of generic first-order approaches. Empirical results show that variance regularization can speed up convergence and stabilize stochastic optimization.
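The abstract does not state the regularizer's formula, so the following is only a minimal sketch of the general idea, not the paper's method. It assumes per-example gradients are available and shrinks each coordinate's learning rate by a signal-to-noise factor, mean² / (mean² + variance of the mean). The function name `variance_regularized_step`, the scaling rule, and the parameters `base_lr` and `eps` are all hypothetical choices made for illustration.

```python
def variance_regularized_step(params, per_example_grads, base_lr=0.1, eps=1e-12):
    """One SGD-like step with a per-coordinate, variance-damped learning rate.

    Assumption (not taken from the paper): the step for coordinate j is scaled
    by mean_j**2 / (mean_j**2 + var_j / batch_size + eps), so coordinates whose
    mini-batch gradient is dominated by noise move less.
    """
    batch_size = len(per_example_grads)
    new_params = []
    for j, theta in enumerate(params):
        # Gradient of coordinate j for every example in the mini-batch.
        column = [g[j] for g in per_example_grads]
        mean = sum(column) / batch_size
        # Unbiased sample variance; defined as 0 for a batch of one example.
        if batch_size > 1:
            var = sum((x - mean) ** 2 for x in column) / (batch_size - 1)
        else:
            var = 0.0
        # Variance of the mini-batch mean, i.e. the noise in the step direction.
        var_of_mean = var / batch_size
        # Scale is close to 1 when the gradient is consistent, smaller when noisy.
        scale = mean * mean / (mean * mean + var_of_mean + eps)
        new_params.append(theta - base_lr * scale * mean)
    return new_params


if __name__ == "__main__":
    params = [1.0, 1.0]
    # Coordinate 0 has identical gradients; coordinate 1 has the same mean but is noisy.
    grads = [[2.0, 1.0], [2.0, 3.0]]
    print(variance_regularized_step(params, grads))
```

In this toy batch both coordinates have the same mean gradient, but the noisy one takes a smaller step, which is the qualitative behavior the abstract describes. The scaling rule could be swapped for any other monotone function of the mini-batch variance.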

