LGOCJun 13, 2022

On the Convergence to a Global Solution of Shuffling-Type Gradient Algorithms

IBM
arXiv:2206.05869v25 citationsh-index: 23
Originality Incremental advance
AI Analysis

This addresses the theoretical gap for a widely used heuristic in machine learning, though it is incremental as it extends prior convex results to non-convex cases.

The paper tackles the convergence of shuffling SGD for non-convex functions in over-parameterized settings, showing it converges to a global solution under relaxed assumptions while maintaining computational complexity.

Stochastic gradient descent (SGD) algorithm is the method of choice in many machine learning tasks thanks to its scalability and efficiency in dealing with large-scale problems. In this paper, we focus on the shuffling version of SGD which matches the mainstream practical heuristics. We show the convergence to a global solution of shuffling SGD for a class of non-convex functions under over-parameterized settings. Our analysis employs more relaxed non-convex assumptions than previous literature. Nevertheless, we maintain the desired computational complexity as shuffling SGD has achieved in the general convex setting.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes