OCLGMLOct 21, 2019

History-Gradient Aided Batch Size Adaptation for Variance Reduced Algorithms

arXiv:1910.09670v47 citations
Originality Incremental advance
AI Analysis

This work addresses the computational inefficiency of variance-reduced algorithms for practitioners in optimization and reinforcement learning, offering an incremental improvement over existing adaptation methods.

The paper tackles the slow practical performance of variance-reduced algorithms by proposing a batch-size adaptation scheme that uses history stochastic gradients, eliminating backtracking steps. It theoretically reduces overall complexity for SVRG and SARAH/SPIDER in nonconvex optimization and reinforcement learning, with extensive experiments validating effectiveness.

Variance-reduced algorithms, although achieve great theoretical performance, can run slowly in practice due to the periodic gradient estimation with a large batch of data. Batch-size adaptation thus arises as a promising approach to accelerate such algorithms. However, existing schemes either apply prescribed batch-size adaption rule or exploit the information along optimization path via additional backtracking and condition verification steps. In this paper, we propose a novel scheme, which eliminates backtracking line search but still exploits the information along optimization path by adapting the batch size via history stochastic gradients. We further theoretically show that such a scheme substantially reduces the overall complexity for popular variance-reduced algorithms SVRG and SARAH/SPIDER for both conventional nonconvex optimization and reinforcement learning problems. To this end, we develop a new convergence analysis framework to handle the dependence of the batch size on history stochastic gradients. Extensive experiments validate the effectiveness of the proposed batch-size adaptation scheme.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes