A KL-based Analysis Framework with Applications to Non-Descent Optimization Methods
This provides theoretical guarantees for distributed and stochastic optimization algorithms used in machine learning, though it appears incremental as an extension of existing analysis techniques.
The authors developed a novel analysis framework based on the Kurdyka-Lojasiewicz property to study non-descent optimization methods in nonconvex settings, enabling the first proof of convergence rates for decentralized gradient methods and federated averaging under mild assumptions.
We propose a novel analysis framework for non-descent-type optimization methodologies in nonconvex scenarios based on the Kurdyka-Lojasiewicz property. Our framework allows covering a broad class of algorithms, including those commonly employed in stochastic and distributed optimization. Specifically, it enables the analysis of first-order methods that lack a sufficient descent property and do not require access to full (deterministic) gradient information. We leverage this framework to establish, for the first time, iterate convergence and the corresponding rates for the decentralized gradient method and federated averaging under mild assumptions. Furthermore, based on the new analysis techniques, we show the convergence of the random reshuffling and stochastic gradient descent method without necessitating typical a priori bounded iterates assumptions.