OCLGMLMay 28, 2022

Generalization Bounds of Nonconvex-(Strongly)-Concave Stochastic Minimax Optimization

ETH Zurich
arXiv:2205.14278v28 citationsh-index: 29
Originality Incremental advance
AI Analysis

It addresses generalization in minimax optimization for machine learning, providing theoretical guarantees that are incremental but specific to this domain.

This paper investigates generalization bounds for nonconvex-(strongly)-concave stochastic minimax optimization, establishing algorithm-agnostic bounds with sample complexities of $ ilde{\mathcal{O}}(d\kappa^2\varepsilon^{-2})$ and $ ilde{\mathcal{O}}(d\varepsilon^{-4})$ for different settings, and algorithm-dependent bounds for SGDA and other algorithms via a novel stability notion.

This paper takes an initial step to systematically investigate the generalization bounds of algorithms for solving nonconvex-(strongly)-concave (NC-SC/NC-C) stochastic minimax optimization measured by the stationarity of primal functions. We first establish algorithm-agnostic generalization bounds via uniform convergence between the empirical minimax problem and the population minimax problem. The sample complexities for achieving $ε$-generalization are $\tilde{\mathcal{O}}(dκ^2ε^{-2})$ and $\tilde{\mathcal{O}}(dε^{-4})$ for NC-SC and NC-C settings, respectively, where $d$ is the dimension and $κ$ is the condition number. We further study the algorithm-dependent generalization bounds via stability arguments of algorithms. In particular, we introduce a novel stability notion for minimax problems and build a connection between generalization bounds and the stability notion. As a result, we establish algorithm-dependent generalization bounds for stochastic gradient descent ascent (SGDA) algorithm and the more general sampling-determined algorithms.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes