Local Convergence of Gradient Descent-Ascent for Training Generative Adversarial Networks
The analysis offers theoretical insight for researchers training GANs, but it is incremental, since it builds on prior work and holds only under specific assumptions.
The paper analyzes the local convergence of gradient descent-ascent for training GANs with a kernel-based discriminator, identifying phase transitions in convergence behavior and showing how learning rates, regularization, and kernel bandwidth affect the convergence rate.
Generative Adversarial Networks (GANs) are a popular formulation for training generative models on complex, high-dimensional data. The standard method for training GANs is a gradient descent-ascent (GDA) procedure applied to a minimax optimization problem. This procedure is hard to analyze in general because its dynamics are nonlinear. We study the local dynamics of GDA for training a GAN with a kernel-based discriminator. Our convergence analysis is based on a linearization of the nonlinear dynamical system that describes the GDA iterations, under the \textit{isolated points model} assumption of [Becker et al. 2022]. The analysis brings out the effect of the learning rates, the regularization, and the bandwidth of the kernel discriminator on the local convergence rate of GDA. Importantly, we show phase transitions that indicate when the system converges, oscillates, or diverges. We also provide numerical simulations that verify our claims.
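To make the idea of a phase transition concrete, the following is a minimal sketch on a toy problem, not the kernel-discriminator GAN studied in the paper. It assumes a regularized bilinear game f(x, y) = x*y - (lam/2)*y**2, where x is minimized and y is maximized with a single shared learning rate eta. The game, the parameter names `eta` and `lam`, and all function names are illustrative assumptions. Because this toy update is already linear, its Jacobian plays the role of the linearized dynamics: the spectral radius decides convergence versus divergence, and complex eigenvalues indicate oscillation.

```python
import cmath


def gda_jacobian(eta, lam):
    """Jacobian of one simultaneous GDA step on the toy game
    f(x, y) = x*y - (lam/2)*y**2 (an illustrative assumption)."""
    # x_next = x - eta * y
    # y_next = y + eta * (x - lam * y)
    return ((1.0, -eta), (eta, 1.0 - eta * lam))


def eigenvalues_2x2(m):
    """Eigenvalues of a 2x2 matrix from its trace and determinant."""
    (a, b), (c, d) = m
    tr = a + d
    det = a * d - b * c
    root = cmath.sqrt(tr * tr - 4.0 * det)
    return (tr + root) / 2.0, (tr - root) / 2.0


def classify(eta, lam, tol=1e-12):
    """Return (regime, oscillatory, spectral_radius) for the toy game."""
    mu1, mu2 = eigenvalues_2x2(gda_jacobian(eta, lam))
    rho = max(abs(mu1), abs(mu2))
    # A nonzero imaginary part means the iterates rotate around the equilibrium.
    oscillatory = abs(mu1.imag) > tol
    if rho < 1.0 - tol:
        regime = "converges"
    elif rho > 1.0 + tol:
        regime = "diverges"
    else:
        regime = "marginal"
    return regime, oscillatory, rho


def run_gda(eta, lam, steps=200, x0=1.0, y0=1.0):
    """Iterate simultaneous GDA from (x0, y0) and return the final point."""
    x, y = x0, y0
    for _ in range(steps):
        grad_x = y            # df/dx
        grad_y = x - lam * y  # df/dy
        # Descent in x and ascent in y, both using the old iterate.
        x, y = x - eta * grad_x, y + eta * grad_y
    return x, y


if __name__ == "__main__":
    for eta, lam in [(0.1, 0.5), (0.1, 0.0), (0.1, 3.0)]:
        regime, oscillatory, rho = classify(eta, lam)
        print(f"eta={eta}, lam={lam}: {regime}, "
              f"oscillatory={oscillatory}, rho={rho:.4f}")
```

In this toy setting, with lam < 2 the eigenvalues are complex with squared modulus 1 - eta*lam + eta**2, so the iterates spiral inward when eta < lam and spiral outward when eta > lam. With no regularization (lam = 0), simultaneous GDA diverges for every positive eta. The thresholds in the paper depend additionally on the kernel bandwidth and on separate learning rates, which this sketch does not model.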