OCLGNADec 31, 2019

Stochastic gradient-free descents

arXiv:1912.13305v53 citations
Originality Incremental advance
AI Analysis

This work addresses optimization challenges in scenarios where gradients are unavailable or noisy, offering theoretical guarantees for stochastic gradient-free methods, though it appears incremental as it extends existing momentum techniques to this setting.

The paper tackles stochastic optimization problems by proposing gradient-free methods and accelerated versions with momentum, achieving sublinear convergence rates of O(1/k) for gradient-free methods and O(1/k^2) for accelerated methods on strongly convex objectives, and convergence to zero expected gradient norm for nonconvex functions.

In this paper we propose stochastic gradient-free methods and accelerated methods with momentum for solving stochastic optimization problems. All these methods rely on stochastic directions rather than stochastic gradients. We analyze the convergence behavior of these methods under the mean-variance framework, and also provide a theoretical analysis about the inclusion of momentum in stochastic settings which reveals that the momentum term we used adds a deviation of order $\mathcal{O}(1/k)$ but controls the variance at the order $\mathcal{O}(1/k)$ for the $k$th iteration. So it is shown that, when employing a decaying stepsize $α_k=\mathcal{O}(1/k)$, the stochastic gradient-free methods can still maintain the sublinear convergence rate $\mathcal{O}(1/k)$ and the accelerated methods with momentum can achieve a convergence rate $\mathcal{O}(1/k^2)$ in probability for the strongly convex objectives with Lipschitz gradients; and all these methods converge to a solution with a zero expected gradient norm when the objective function is nonconvex, twice differentiable and bounded below.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes