LGOCMLJun 5, 2019

On the Convergence of SARAH and Beyond

arXiv:1906.02351v232 citations
Originality Incremental advance
AI Analysis

This work addresses optimization efficiency for machine learning practitioners by providing incremental improvements to variance reduction methods.

The authors tackled the problem of optimizing summation-based loss functions by proposing a unifying algorithm called LoopLess SARAH (L2S), which extends the SARAH method and achieves complexities such as O((n+κ) ln(1/ε)) for strongly convex problems and O(n+√n/ε) for convex and nonconvex problems, with numerical tests showing better generalization than SARAH on neural networks.

The main theme of this work is a unifying algorithm, \textbf{L}oop\textbf{L}ess \textbf{S}ARAH (L2S) for problems formulated as summation of $n$ individual loss functions. L2S broadens a recently developed variance reduction method known as SARAH. To find an $ε$-accurate solution, L2S enjoys a complexity of ${\cal O}\big( (n+κ) \ln (1/ε)\big)$ for strongly convex problems. For convex problems, when adopting an $n$-dependent step size, the complexity of L2S is ${\cal O}(n+ \sqrt{n}/ε)$; while for more frequently adopted $n$-independent step size, the complexity is ${\cal O}(n+ n/ε)$. Distinct from SARAH, our theoretical findings support an $n$-independent step size in convex problems without extra assumptions. For nonconvex problems, the complexity of L2S is ${\cal O}(n+ \sqrt{n}/ε)$. Our numerical tests on neural networks suggest that L2S can have better generalization properties than SARAH. Along with L2S, our side results include the linear convergence of the last iteration for SARAH in strongly convex problems.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes