LGOCMLDec 10, 2021

Faster Single-loop Algorithms for Minimax Optimization without Strong Concavity

arXiv:2112.05604v182 citations
Originality Incremental advance
AI Analysis

It addresses convergence issues in practical applications like GANs and adversarial training, offering incremental improvements over existing single-loop methods.

This paper tackles the problem of slow convergence in single-loop algorithms for nonconvex minimax optimization by analyzing alternating GDA and smoothed GDA under the Polyak-Lojasiewicz condition, achieving improved iteration complexities such as O(κε^{-2}) for smoothed GDA, which outperforms vanilla GDA.

Gradient descent ascent (GDA), the simplest single-loop algorithm for nonconvex minimax optimization, is widely used in practical applications such as generative adversarial networks (GANs) and adversarial training. Albeit its desirable simplicity, recent work shows inferior convergence rates of GDA in theory even assuming strong concavity of the objective on one side. This paper establishes new convergence results for two alternative single-loop algorithms -- alternating GDA and smoothed GDA -- under the mild assumption that the objective satisfies the Polyak-Lojasiewicz (PL) condition about one variable. We prove that, to find an $ε$-stationary point, (i) alternating GDA and its stochastic variant (without mini batch) respectively require $O(κ^{2} ε^{-2})$ and $O(κ^{4} ε^{-4})$ iterations, while (ii) smoothed GDA and its stochastic variant (without mini batch) respectively require $O(κε^{-2})$ and $O(κ^{2} ε^{-4})$ iterations. The latter greatly improves over the vanilla GDA and gives the hitherto best known complexity results among single-loop algorithms under similar settings. We further showcase the empirical efficiency of these algorithms in training GANs and robust nonlinear regression.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes