Stability and Generalization of Stochastic Gradient Methods for Minimax Problems
This work addresses the generalization gap for minimax problems such as GANs and robust estimation and provides theoretical guarantees for practitioners, though it is incremental in that it builds on existing stability frameworks.
The paper tackles the lack of generalization analysis for stochastic gradient methods in minimax problems. It establishes quantitative connections between algorithmic stability and several generalization measures, shows that stochastic gradient descent ascent attains optimal generalization bounds in the convex-concave setting, and derives bounds for weakly-convex-weakly-concave and gradient-dominated problems.
Many machine learning problems can be formulated as minimax problems, such as Generative Adversarial Networks (GANs), AUC maximization, and robust estimation, to mention but a few. A substantial number of studies are devoted to the convergence behavior of their stochastic gradient-type algorithms. In contrast, there is relatively little work on their generalization, i.e., how the learning models built from training examples would behave on test examples. In this paper, we provide a comprehensive generalization analysis of stochastic gradient methods for minimax problems under both convex-concave and nonconvex-nonconcave cases through the lens of algorithmic stability. We establish a quantitative connection between stability and several generalization measures both in expectation and with high probability. For the convex-concave setting, our stability analysis shows that stochastic gradient descent ascent attains optimal generalization bounds for both smooth and nonsmooth minimax problems. We also establish generalization bounds for both weakly-convex-weakly-concave and gradient-dominated problems.
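To make the stability viewpoint concrete, the following is a minimal sketch of stochastic gradient descent ascent (SGDA) together with an empirical stability probe. It is an illustration under stated assumptions, not the paper's construction: the toy objective f(w, v; z) = 0.5(w - z)^2 + wv - 0.5v^2 (convex in w, concave in v), the step size, and the helper names `sgda` and `stability_gap` are all hypothetical choices made here. The probe runs SGDA on two datasets that differ in a single example, using the same sampling sequence, and reports how far apart the two outputs end up.

```python
import random


def grad_w(w, v, z):
    # Partial derivative of the assumed toy objective with respect to w.
    return (w - z) + v


def grad_v(w, v, z):
    # Partial derivative of the assumed toy objective with respect to v.
    return w - v


def sgda(data, steps, eta, seed):
    """Run SGDA: descend in w and ascend in v on one sampled example per step."""
    rng = random.Random(seed)
    w, v = 0.0, 0.0
    for _ in range(steps):
        z = data[rng.randrange(len(data))]
        gw, gv = grad_w(w, v, z), grad_v(w, v, z)
        # Simultaneous update: both gradients use the previous iterate.
        w, v = w - eta * gw, v + eta * gv
    return w, v


def stability_gap(n, steps=200, eta=0.05, seed=0):
    """Distance between SGDA outputs on two datasets differing in one example."""
    data_rng = random.Random(123)
    data = [data_rng.gauss(0.0, 1.0) for _ in range(n)]
    neighbor = list(data)
    neighbor[0] = data[0] + 1.0  # replace a single training example
    # The same seed gives both runs the same sequence of sampled indices.
    w1, v1 = sgda(data, steps, eta, seed)
    w2, v2 = sgda(neighbor, steps, eta, seed)
    return ((w1 - w2) ** 2 + (v1 - v2) ** 2) ** 0.5


if __name__ == "__main__":
    for n in (1, 10, 100):
        print(n, stability_gap(n))
```

A small gap means the algorithm's output is insensitive to any single training example, which is the property that stability-based arguments convert into generalization bounds. The printed values depend on the random seed and are only meant for qualitative inspection.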