OC LGMar 5, 2024

Shuffling Momentum Gradient Algorithm for Convex Optimization

Trang H. Tran, Quoc Tran-Dinh, Lam M. Nguyen

arXiv:2403.03180v15.62 citationsh-index: 28Vietnam J Math

Originality Incremental advance

AI Analysis

This work provides theoretical guarantees for shuffling momentum methods in convex optimization, which is incremental but addresses a gap in the literature for strongly convex problems.

The paper tackles the analysis of shuffling momentum gradient methods for finite-sum convex and strongly convex optimization problems, achieving a convergence rate of O(1/nT^2) for the strongly convex setting, which matches the best rates of existing shuffling stochastic gradient algorithms.

The Stochastic Gradient Descent method (SGD) and its stochastic variants have become methods of choice for solving finite-sum optimization problems arising from machine learning and data science thanks to their ability to handle large-scale applications and big datasets. In the last decades, researchers have made substantial effort to study the theoretical performance of SGD and its shuffling variants. However, only limited work has investigated its shuffling momentum variants, including shuffling heavy-ball momentum schemes for non-convex problems and Nesterov's momentum for convex settings. In this work, we extend the analysis of the shuffling momentum gradient method developed in [Tran et al (2021)] to both finite-sum convex and strongly convex optimization problems. We provide the first analysis of shuffling momentum-based methods for the strongly convex setting, attaining a convergence rate of $O(1/nT^2)$, where $n$ is the number of samples and $T$ is the number of training epochs. Our analysis is a state-of-the-art, matching the best rates of existing shuffling stochastic gradient algorithms in the literature.

View on arXiv PDF

Similar