Last-iterate convergence rates for min-max optimization
This addresses a key problem in nonconvex applications like training Generative Adversarial Networks, providing theoretical guarantees for algorithms that previously lacked such convergence proofs, though it is incremental as it builds on prior work in convex-concave settings.
The paper tackles the challenge of proving last-iterate convergence in min-max optimization, showing that Hamiltonian Gradient Descent achieves linear convergence in convex-concave problems under a 'sufficiently bilinear' condition, with similar results for Consensus Optimization in certain parameter settings.
While classic work in convex-concave min-max optimization relies on average-iterate convergence results, the emergence of nonconvex applications such as training Generative Adversarial Networks has led to renewed interest in last-iterate convergence guarantees. Proving last-iterate convergence is challenging because many natural algorithms, such as Simultaneous Gradient Descent/Ascent, provably diverge or cycle even in simple convex-concave min-max settings, and previous work on global last-iterate convergence rates has been limited to the bilinear and convex-strongly concave settings. In this work, we show that the Hamiltonian Gradient Descent (HGD) algorithm achieves linear convergence in a variety of more general settings, including convex-concave problems that satisfy a "sufficiently bilinear" condition. We also prove similar convergence rates for the Consensus Optimization (CO) algorithm of [MNG17] for some parameter settings of CO.