LG, GT, ML | Jun 16, 2020

Linear Last-iterate Convergence in Constrained Saddle-point Optimization

arXiv:2006.09517v3
43 citations
Originality: Incremental advance
AI Analysis

This work provides theoretical last-iterate convergence guarantees for optimistic saddle-point algorithms used in machine learning, addressing a gap in the understanding of convergence rates in constrained settings. It is an incremental advance, as it builds on prior analyses.

The paper studies last-iterate convergence rates for Optimistic Gradient Descent Ascent (OGDA) and Optimistic Multiplicative Weights Update (OMWU) in constrained saddle-point optimization. It shows that OMWU achieves linear convergence with a universal constant learning rate for bilinear games over the simplex when the equilibrium is unique. It also shows that OGDA converges exponentially fast for bilinear games over polytopes without requiring a unique equilibrium. Both results are supported by experiments.

Optimistic Gradient Descent Ascent (OGDA) and Optimistic Multiplicative Weights Update (OMWU) for saddle-point optimization have received growing attention due to their favorable last-iterate convergence. However, their behaviors for simple bilinear games over the probability simplex are still not fully understood - previous analysis lacks explicit convergence rates, only applies to an exponentially small learning rate, or requires additional assumptions such as the uniqueness of the optimal solution. In this work, we significantly expand the understanding of last-iterate convergence for OGDA and OMWU in the constrained setting. Specifically, for OMWU in bilinear games over the simplex, we show that when the equilibrium is unique, linear last-iterate convergence is achieved with a learning rate whose value is set to a universal constant, improving the result of (Daskalakis & Panageas, 2019b) under the same assumption. We then significantly extend the results to more general objectives and feasible sets for the projected OGDA algorithm, by introducing a sufficient condition under which OGDA exhibits concrete last-iterate convergence rates with a constant learning rate whose value only depends on the smoothness of the objective function. We show that bilinear games over any polytope satisfy this condition and OGDA converges exponentially fast even without the unique equilibrium assumption. Our condition also holds for strongly-convex-strongly-concave functions, recovering the result of (Hsieh et al., 2019). Finally, we provide experimental results to further support our theory.
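To make the OMWU setting in the abstract concrete, here is a minimal sketch of one common two-sequence form of OMWU for a bilinear game min_x max_y x^T G y over probability simplices. It is an illustration under stated assumptions, not the authors' implementation: the function names (`omwu`, `mwu_step`, and the helpers), the matching-pennies matrix, the starting points, the step count, and the learning rate `eta = 0.125` are all choices made here and are not taken from the abstract.

```python
import math


def normalize(weights):
    """Rescale positive weights so they sum to 1 (a point on the simplex)."""
    total = sum(weights)
    return [w / total for w in weights]


def mat_vec(G, v):
    """Return G v for a matrix given as a list of rows."""
    return [sum(g * vj for g, vj in zip(row, v)) for row in G]


def mat_t_vec(G, u):
    """Return G^T u."""
    return [sum(G[i][j] * u[i] for i in range(len(G))) for j in range(len(G[0]))]


def mwu_step(base, loss, eta):
    """Multiplicative-weights step: new_i is proportional to base_i * exp(-eta * loss_i)."""
    return normalize([b * math.exp(-eta * l) for b, l in zip(base, loss)])


def omwu(G, x_hat, y_hat, eta=0.125, steps=5000):
    """Run OMWU; x minimizes and y maximizes x^T G y. Returns the last iterate (x, y).

    eta and steps are illustrative assumptions, not values from the abstract.
    """
    x, y = list(x_hat), list(y_hat)  # initialize the played iterate at the starting point
    for _ in range(steps):
        # Played iterate: step from the auxiliary point using the PREVIOUS gradient.
        x_new = mwu_step(x_hat, mat_vec(G, y), eta)
        y_new = mwu_step(y_hat, [-v for v in mat_t_vec(G, x)], eta)
        # Auxiliary point: step from itself using the gradient at the NEW iterate.
        x_hat = mwu_step(x_hat, mat_vec(G, y_new), eta)
        y_hat = mwu_step(y_hat, [-v for v in mat_t_vec(G, x_new)], eta)
        x, y = x_new, y_new
    return x, y


if __name__ == "__main__":
    # Matching pennies: the unique equilibrium is uniform play for both players.
    G = [[1.0, -1.0], [-1.0, 1.0]]
    x, y = omwu(G, [0.9, 0.1], [0.2, 0.8])
    print("last iterate x:", x)
    print("last iterate y:", y)
```

Matching pennies is used because its equilibrium is unique, which is the assumption under which the abstract claims linear last-iterate convergence for OMWU. The quantity to watch is the last iterate itself rather than a running average of the iterates.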

Code Implementations: 1 repo
