OC LGOct 23, 2020

Escape saddle points faster on manifolds via perturbed Riemannian stochastic recursive gradient

Andi Han, Junbin Gao

arXiv:2010.12191v25.93 citations

Originality Incremental advance

AI Analysis

This work addresses optimization challenges for machine learning and AI applications on manifolds, offering improved convergence guarantees over existing methods, though it is incremental in nature.

The paper tackles the problem of escaping saddle points efficiently on Riemannian manifolds by proposing a perturbed Riemannian stochastic recursive gradient method, achieving a complexity of $\widetilde{\mathcal{O}}ig( rac{ \sqrt{n}}{ε^2} + rac{\sqrt{n} }{δ^4} + rac{n}{δ^3}ig)$ stochastic gradient queries to find a $(ε, δ)$-second-order critical point in finite-sum settings.

In this paper, we propose a variant of Riemannian stochastic recursive gradient method that can achieve second-order convergence guarantee and escape saddle points using simple perturbation. The idea is to perturb the iterates when gradient is small and carry out stochastic recursive gradient updates over tangent space. This avoids the complication of exploiting Riemannian geometry. We show that under finite-sum setting, our algorithm requires $\widetilde{\mathcal{O}}\big( \frac{ \sqrt{n}}{ε^2} + \frac{\sqrt{n} }{δ^4} + \frac{n}{δ^3}\big)$ stochastic gradient queries to find a $(ε, δ)$-second-order critical point. This strictly improves the complexity of perturbed Riemannian gradient descent and is superior to perturbed Riemannian accelerated gradient descent under large-sample settings. We also provide a complexity of $\widetilde{\mathcal{O}} \big( \frac{1}{ε^3} + \frac{1}{δ^3 ε^2} + \frac{1}{δ^4 ε} \big)$ for online optimization, which is novel on Riemannian manifold in terms of second-order convergence using only first-order information.

View on arXiv PDF

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes