LGOCMLJan 30, 2025

Faster Convergence of Riemannian Stochastic Gradient Descent with Increasing Batch Size

arXiv:2501.18164v63 citationsh-index: 4
Originality Incremental advance
AI Analysis

This provides a theoretical improvement for optimization on Riemannian manifolds, which is incremental but could benefit applications like principal component analysis and low-rank matrix completion.

The paper tackled the convergence of Riemannian stochastic gradient descent (RSGD) by showing that using an increasing batch size leads to faster convergence than a constant batch size, improving the rate from O(T^{-1}+C) to O(T^{-1}) and reducing stochastic first-order oracle complexity.

We theoretically analyzed the convergence behavior of Riemannian stochastic gradient descent (RSGD) and found that using an increasing batch size leads to faster convergence than using a constant batch size, not only with a constant learning rate but also with a decaying learning rate, such as cosine annealing decay and polynomial decay. The convergence rate improves from $O(T^{-1}+C)$ with a constant batch size to $O(T^{-1})$ with an increasing batch size, where $T$ denotes the total number of iterations and $C$ is a constant. Using principal component analysis and low-rank matrix completion, we investigated, both theoretically and numerically, how an increasing batch size affects computational time as quantified by stochastic first-order oracle (SFO) complexity. An increasing batch size was found to reduce the SFO complexity of RSGD. Furthermore, an increasing batch size was found to offer the advantages of both small and large constant batch sizes.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes