LGOCMLNov 5, 2014

Global Convergence of Stochastic Gradient Descent for Some Non-convex Matrix Problems

arXiv:1411.1134v3154 citations
Originality Highly original
AI Analysis

This provides a theoretical guarantee for SGD in non-convex optimization, which is incremental but important for applications like matrix completion and subspace tracking.

The paper tackles the problem of global convergence for stochastic gradient descent (SGD) on non-convex matrix problems by proposing a step size scheme, proving that it converges globally from a random starting point within O(ε⁻¹ n log n) steps with constant probability for constant-rank problems.

Stochastic gradient descent (SGD) on a low-rank factorization is commonly employed to speed up matrix problems including matrix completion, subspace tracking, and SDP relaxation. In this paper, we exhibit a step size scheme for SGD on a low-rank least-squares problem, and we prove that, under broad sampling conditions, our method converges globally from a random starting point within $O(ε^{-1} n \log n)$ steps with constant probability for constant-rank problems. Our modification of SGD relates it to stochastic power iteration. We also show experiments to illustrate the runtime and convergence of the algorithm.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes