LG OC MLJul 11, 2025

Revisiting Convergence: Shuffling Complexity Beyond Lipschitz Smoothness

arXiv:2507.08913v14.11 citationsh-index: 3ICML

Originality Incremental advance

AI Analysis

This work addresses a gap in theoretical analysis for practitioners using shuffling-type gradient methods in machine learning, though it is incremental as it extends existing convergence results to broader conditions.

The paper tackled the problem of convergence guarantees for shuffling-type gradient methods without assuming Lipschitz smoothness, which is often unmet in machine learning models, and achieved convergence rates matching the best-known under weaker assumptions for nonconvex, strongly convex, and non-strongly convex cases.

Shuffling-type gradient methods are favored in practice for their simplicity and rapid empirical performance. Despite extensive development of convergence guarantees under various assumptions in recent years, most require the Lipschitz smoothness condition, which is often not met in common machine learning models. We highlight this issue with specific counterexamples. To address this gap, we revisit the convergence rates of shuffling-type gradient methods without assuming Lipschitz smoothness. Using our stepsize strategy, the shuffling-type gradient algorithm not only converges under weaker assumptions but also match the current best-known convergence rates, thereby broadening its applicability. We prove the convergence rates for nonconvex, strongly convex, and non-strongly convex cases, each under both random reshuffling and arbitrary shuffling schemes, under a general bounded variance condition. Numerical experiments further validate the performance of our shuffling-type gradient algorithm, underscoring its practical efficacy.

View on arXiv PDF

Similar