OC · LG · May 12

Constrained Stochastic Spectral Preconditioning Converges for Nonconvex Objectives

Konstantinos Oikonomidis, Jan Quan, Kimon Antonakopoulos, Antonio Silveti-Falls, Volkan Cevher, Panagiotis Patrinos

arXiv:2605.11850 · 18.1

Predicted impact: top 6% in OC · last 90 days
Originality: Incremental advance

AI Analysis

Provides theoretical convergence guarantees for spectral preconditioning methods used in deep learning, addressing a gap in understanding their behavior on nonconvex problems.

This paper develops proximal preconditioned gradient methods extending Muon and Scion optimizers, proving convergence for nonconvex objectives under heavy-tailed noise and achieving faster convergence with variance reduction.

In this work, we develop proximal preconditioned gradient methods with a focus on spectral gradient methods, providing a proximal extension to the Muon and Scion optimizers. We introduce a family of stochastic algorithms that can handle a wide variety of convex and nonconvex constraints, and we study its convergence under heavy-tailed noise through a novel analysis tailored to the geometry of the proposed methods. We further propose a variance-reduced version, which achieves faster convergence under standard noise assumptions. Finally, we show that the polynomial iterations used in Muon are more accurately captured by a nonlinear preconditioner than by the ideal matrix sign, leading to a convergence analysis that more faithfully reflects practical implementations.
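
The final claim of the abstract can be illustrated with a small numerical sketch. The code below is not the paper's algorithm: it assumes the classical cubic Newton-Schulz iteration X <- 1.5 X - 0.5 X Xᵀ X as a stand-in for the polynomial iteration used in Muon, and the names `polar_factor` and `newton_schulz` are hypothetical helpers introduced only for this illustration. It compares a few polynomial steps against the ideal matrix sign (the polar factor U Vᵀ from the SVD).

```python
import numpy as np

def polar_factor(G):
    """Ideal 'matrix sign' of a rectangular matrix: U @ Vt from the thin SVD."""
    U, _, Vt = np.linalg.svd(G, full_matrices=False)
    return U @ Vt

def newton_schulz(G, steps):
    """Classical cubic Newton-Schulz iteration (an assumed stand-in, not Muon's
    exact polynomial). Each step maps every singular value s to 1.5*s - 0.5*s**3
    while leaving the singular vectors unchanged."""
    # Frobenius normalization puts all singular values in (0, 1].
    X = G / (np.linalg.norm(G) + 1e-12)
    for _ in range(steps):
        X = 1.5 * X - 0.5 * X @ X.T @ X
    return X

rng = np.random.default_rng(0)
G = rng.standard_normal((4, 6))

ideal = polar_factor(G)
few = newton_schulz(G, 5)     # a small, practical number of steps
many = newton_schulz(G, 50)   # run long enough to approach the limit

err_few = np.linalg.norm(few - ideal)
err_many = np.linalg.norm(many - ideal)
sv_few = np.linalg.svd(few, compute_uv=False)

print("error after 5 steps: ", err_few)
print("error after 50 steps:", err_many)
print("singular values after 5 steps:", sv_few)
```

With only a few steps, the singular values of the output are spread below 1 rather than all equal to 1, so the map acts as a nonlinear function of the singular values instead of the exact sign. This is the kind of gap between the practical iteration and the idealized operator that the abstract refers to.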
