OC · LG · ML · Jul 11, 2025

Stochastic Approximation with Block Coordinate Optimal Stepsizes

arXiv:2507.08963v1 · 2 citations · h-index: 1
Originality: Incremental advance
AI Analysis

This work targets optimization efficiency for machine learning practitioners: it offers an alternative to Adam that needs less memory and fewer hyper-parameters. The advance is incremental, since it builds on existing adaptive methods.

The paper tackles the problem of designing adaptive stepsize rules for stochastic approximation with block-coordinate updates, with the aim of minimizing the expected distance from the next iterate to an optimal point. It proposes a new method that performs comparably to Adam with less memory and fewer hyper-parameters, and it proves almost sure convergence to a small neighborhood of the optimum under an aiming condition that assumes neither convexity nor smoothness.

We consider stochastic approximation with block-coordinate stepsizes and propose adaptive stepsize rules that aim to minimize the expected distance from the next iterate to an optimal point. These stepsize rules employ online estimates of the second moment of the search direction along each block coordinate. The popular Adam algorithm can be interpreted as a particular heuristic for such estimation. By leveraging a simple conditional estimator, we derive a new method that achieves performance comparable to Adam but requires less memory and fewer hyper-parameters. We prove that this family of methods converges almost surely to a small neighborhood of the optimal point, and that the radius of the neighborhood depends on the bias and variance of the second-moment estimator. Our analysis relies on a simple aiming condition that assumes neither convexity nor smoothness, and thus has broad applicability.
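To make the idea of block-coordinate stepsizes concrete, here is a minimal sketch in plain Python. It is not the paper's method: the conditional estimator is not described in this summary, so the sketch substitutes a generic exponential moving average (the Adam-style heuristic the abstract mentions) as the second-moment estimate, kept as one scalar per block. The function names, the toy quadratic objective, the `1/sqrt(t)` decay, and all hyper-parameter values are assumptions chosen for illustration.

```python
import random


def block_adaptive_sgd(grad_fn, x0, blocks, base_lr=0.1, beta=0.9,
                       eps=1e-8, steps=500, seed=0):
    """Illustrative block-coordinate adaptive stepsizes (hypothetical helper).

    `blocks` is a list of index lists that partition the coordinates.
    One scalar second-moment estimate is stored per block.
    """
    rng = random.Random(seed)
    x = list(x0)
    v = [0.0] * len(blocks)  # second-moment estimate, one scalar per block
    for t in range(1, steps + 1):
        d = grad_fn(x, rng)  # stochastic search direction
        for b, idx in enumerate(blocks):
            # Mean squared magnitude of the direction within this block.
            sq = sum(d[i] ** 2 for i in idx) / len(idx)
            # Assumption: EMA estimator with Adam-style bias correction.
            v[b] = beta * v[b] + (1.0 - beta) * sq
            v_hat = v[b] / (1.0 - beta ** t)
            # Block stepsize: decaying base rate over the estimated RMS.
            step = (base_lr / t ** 0.5) / (v_hat ** 0.5 + eps)
            for i in idx:
                x[i] -= step * d[i]
    return x


def noisy_quadratic_grad(x, rng, noise=0.01):
    """Gradient of a toy quadratic with very different curvature per block."""
    scales = [1.0, 1.0, 100.0, 100.0]
    return [s * xi + rng.gauss(0.0, noise) for s, xi in zip(scales, x)]


x_final = block_adaptive_sgd(
    noisy_quadratic_grad,
    x0=[1.0, -1.0, 1.0, -1.0],
    blocks=[[0, 1], [2, 3]],
)
print(x_final)  # iterates should end close to the optimum at the origin
```

In this sketch the two blocks have curvatures that differ by a factor of 100, yet each block's stepsize is rescaled by its own second-moment estimate, so both make progress at a similar rate. Storing one scalar per block instead of one per coordinate is a property of this illustration only.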

