OCLGFeb 27, 2021

On the Initialization for Convex-Concave Min-max Problems

arXiv:2103.00284v24 citations
AI Analysis

This addresses a fundamental limitation in optimization for machine learning practitioners, offering incremental improvements in algorithm design for min-max problems.

The paper tackles the issue that convergence rates for convex-concave min-max problems typically ignore initialization, showing that strict-convexity-strict-concavity enables initialization-dependent rates and proposing algorithms, including a parameter-free one, that achieve improved asymptotic and non-asymptotic rates, with experiments verifying effectiveness.

Convex-concave min-max problems are ubiquitous in machine learning, and people usually utilize first-order methods (e.g., gradient descent ascent) to find the optimal solution. One feature which separates convex-concave min-max problems from convex minimization problems is that the best known convergence rates for min-max problems have an explicit dependence on the size of the domain, rather than on the distance between initial point and the optimal solution. This means that the convergence speed does not have any improvement even if the algorithm starts from the optimal solution, and hence, is oblivious to the initialization. Here, we show that strict-convexity-strict-concavity is sufficient to get the convergence rate to depend on the initialization. We also show how different algorithms can asymptotically achieve initialization-dependent convergence rates on this class of functions. Furthermore, we show that the so-called "parameter-free" algorithms allow to achieve improved initialization-dependent asymptotic rates without any learning rate to tune. In addition, we utilize this particular parameter-free algorithm as a subroutine to design a new algorithm, which achieves a novel non-asymptotic fast rate for strictly-convex-strictly-concave min-max problems with a growth condition and H{ö}lder continuous solution mapping. Experiments are conducted to verify our theoretical findings and demonstrate the effectiveness of the proposed algorithms.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes