LGOCMLMay 30, 2023

On Convergence of Incremental Gradient for Non-Convex Smooth Functions

arXiv:2305.19259v48 citations
Originality Incremental advance
AI Analysis

This work addresses a theoretical gap in optimization for machine learning practitioners, providing enhanced convergence guarantees for widely used algorithms, though it appears incremental in nature.

The paper tackles the theoretical convergence properties of incremental gradient and shuffle SGD algorithms for non-convex smooth functions, showing improved convergence guarantees that reduce the optimization term from O(n/ε) to O(1/ε) to reach accuracy ε, where n is the training set size.

In machine learning and neural network optimization, algorithms like incremental gradient, and shuffle SGD are popular due to minimizing the number of cache misses and good practical convergence behavior. However, their optimization properties in theory, especially for non-convex smooth functions, remain incompletely explored. This paper delves into the convergence properties of SGD algorithms with arbitrary data ordering, within a broad framework for non-convex smooth functions. Our findings show enhanced convergence guarantees for incremental gradient and single shuffle SGD. Particularly if $n$ is the training set size, we improve $n$ times the optimization term of convergence guarantee to reach accuracy $\varepsilon$ from $O(n / \varepsilon)$ to $O(1 / \varepsilon)$.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes