Averaging Stochastic Gradient Descent on Riemannian Manifolds
The paper provides a method to accelerate stochastic optimization on Riemannian manifolds for problems such as streaming PCA, though the contribution appears incremental, since it extends iterate-averaging techniques to the Riemannian setting.
The paper tackles the slow convergence of stochastic gradient descent on Riemannian manifolds. It develops a geometric averaging framework that transforms slowly converging SGD iterates into averaged iterates with a robust O(1/n) convergence rate. Applications include streaming k-PCA, where the averaged algorithm achieves the optimal convergence rate.
We consider the minimization of a function defined on a Riemannian manifold $\mathcal{M}$ accessible only through unbiased estimates of its gradients. We develop a geometric framework to transform a sequence of slowly converging iterates generated from stochastic gradient descent (SGD) on $\mathcal{M}$ to an averaged iterate sequence with a robust and fast $O(1/n)$ convergence rate. We then present an application of our framework to geodesically-strongly-convex (and possibly Euclidean non-convex) problems. Finally, we demonstrate how these ideas apply to the case of streaming $k$-PCA, where we show how to accelerate the slow rate of the randomized power method (without requiring knowledge of the eigengap) into a robust algorithm achieving the optimal rate of convergence.
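The averaging idea can be illustrated with a minimal sketch for the simplest case, $k = 1$, where the manifold is the unit sphere and the goal is the leading eigenvector of a covariance matrix. This is an illustration under stated assumptions, not the paper's algorithm: it uses the sphere's exact exponential and logarithm maps, a step size $\gamma_n = c / \sqrt{n}$, and a streaming geometric average that moves the running average a fraction $1/(n+1)$ of the way toward each new iterate. The function names (`sphere_exp`, `sphere_log`, `averaged_rsgd_pca`), the step-size constants, and the synthetic data are all hypothetical choices made for this example.

```python
import numpy as np


def sphere_exp(x, v):
    """Exponential map on the unit sphere: start at x, move along tangent vector v."""
    norm_v = np.linalg.norm(v)
    if norm_v < 1e-12:
        return x
    return np.cos(norm_v) * x + np.sin(norm_v) * (v / norm_v)


def sphere_log(x, y):
    """Logarithm map on the unit sphere: tangent vector at x pointing toward y."""
    p = y - np.dot(x, y) * x  # component of y orthogonal to x
    norm_p = np.linalg.norm(p)
    if norm_p < 1e-12:
        return np.zeros_like(x)
    theta = np.arccos(np.clip(np.dot(x, y), -1.0, 1.0))  # geodesic distance
    return theta * p / norm_p


def averaged_rsgd_pca(samples, x0, step_const=0.5, step_power=0.5):
    """Riemannian SGD on the sphere with a streaming geometric average.

    Minimizes f(x) = -0.5 * x^T E[a a^T] x over unit vectors x.
    Returns (last SGD iterate, averaged iterate).
    """
    x = x0 / np.linalg.norm(x0)
    x_avg = x.copy()
    for n, a in enumerate(samples, start=1):
        # Unbiased stochastic Euclidean gradient from one sample a.
        g = -a * np.dot(a, x)
        # Riemannian gradient: project onto the tangent space at x.
        rgrad = g - np.dot(g, x) * x
        gamma = step_const / n ** step_power  # assumed slowly decaying step size
        x = sphere_exp(x, -gamma * rgrad)
        x = x / np.linalg.norm(x)  # guard against floating-point drift
        # Streaming average: move 1/(n+1) of the way from x_avg toward x.
        x_avg = sphere_exp(x_avg, sphere_log(x_avg, x) / (n + 1))
        x_avg = x_avg / np.linalg.norm(x_avg)
    return x, x_avg


# Synthetic demo: covariance diag(4, 1, 0.5, 0.25, 0.25), leading eigenvector e_1.
rng = np.random.default_rng(0)
scales = np.sqrt(np.array([4.0, 1.0, 0.5, 0.25, 0.25]))
samples = rng.standard_normal((5000, 5)) * scales
x_last, x_avg = averaged_rsgd_pca(samples, rng.standard_normal(5))
print("alignment of last iterate:", abs(x_last[0]))
print("alignment of averaged iterate:", abs(x_avg[0]))
```

The eigenvector is only defined up to sign, so alignment is measured by the absolute value of the first coordinate. Extending the sketch to general $k$ would require working on the Grassmann manifold with a suitable retraction and its inverse, which is outside the scope of this illustration.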