LG, OC, ML · Sep 30, 2015

Convergence of Stochastic Gradient Descent for PCA

arXiv:1509.09002v2 · 88 citations
AI Analysis

This paper addresses a theoretical challenge in non-convex optimization for machine learning, offering incremental improvements in the convergence analysis of SGD for PCA.

The paper tackles the problem of analyzing stochastic gradient descent (SGD) for principal component analysis (PCA) in a streaming stochastic setting, providing the first eigengap-free convergence guarantees and, under an eigengap assumption, new guarantees with improved dependence on the eigengap.

We consider the problem of principal component analysis (PCA) in a streaming stochastic setting, where our goal is to find a direction of approximate maximal variance, based on a stream of i.i.d. data points in $\mathbb{R}^d$. A simple and computationally cheap algorithm for this is stochastic gradient descent (SGD), which incrementally updates its estimate based on each new data point. However, due to the non-convex nature of the problem, analyzing its performance has been a challenge. In particular, existing guarantees rely on a non-trivial eigengap assumption on the covariance matrix, which is intuitively unnecessary. In this paper, we provide (to the best of our knowledge) the first eigengap-free convergence guarantees for SGD in the context of PCA. This also partially resolves an open problem posed in \cite{hardt2014noisy}. Moreover, under an eigengap assumption, we show that the same techniques lead to new SGD convergence guarantees with better dependence on the eigengap.
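To make the streaming setting concrete, here is a minimal sketch of an SGD iteration for PCA. The abstract does not spell out the update rule, so the code assumes the common Oja-style projected step, w ← w + η x⟨x, w⟩ followed by renormalization. The function names, step size, and synthetic data are hypothetical choices for illustration and are not taken from the paper.

```python
import math
import random


def sgd_pca(stream, dim, step_size, seed=0):
    """Estimate a direction of (approximately) maximal variance from a stream.

    Assumed update (Oja-style projected SGD, an illustrative choice):
        w <- w + step_size * x * <x, w>, then renormalize w to unit length.
    """
    rng = random.Random(seed)
    # Random unit-norm starting point.
    w = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(v * v for v in w))
    w = [v / norm for v in w]

    for x in stream:
        # Stochastic gradient of w^T (x x^T) w, up to a constant factor of 2.
        dot = sum(xi * wi for xi, wi in zip(x, w))
        w = [wi + step_size * dot * xi for xi, wi in zip(x, w)]
        # Project back onto the unit sphere.
        norm = math.sqrt(sum(v * v for v in w))
        w = [v / norm for v in w]
    return w


def synthetic_stream(n, seed=1):
    """Hypothetical i.i.d. data with covariance diag(4, 0.25, 0.25)."""
    rng = random.Random(seed)
    scales = [2.0, 0.5, 0.5]
    for _ in range(n):
        yield [s * rng.gauss(0.0, 1.0) for s in scales]


w_hat = sgd_pca(synthetic_stream(5000), dim=3, step_size=0.01)

# Variance captured by w_hat under the known covariance A = diag(4, 0.25, 0.25),
# i.e. the Rayleigh quotient w^T A w for a unit vector w. The maximum is 4.
variance_captured = sum(a * v * v for a, v in zip([4.0, 0.25, 0.25], w_hat))
print(w_hat, variance_captured)
```

Each step touches only the current data point, so memory use is linear in the dimension and no covariance matrix is ever formed. Measuring quality by the variance captured, rather than by the angle to the top eigenvector, is the kind of criterion that remains meaningful without an eigengap: when the top eigenvalues are nearly equal, many directions capture close to the maximal variance.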

