LG, CV, OC | Mar 28, 2022

Conjugate Gradient Method for Generative Adversarial Networks

arXiv:2203.14495v3 | 1 citation | h-index: 25 | Has Code
Originality: Incremental advance
AI Analysis

This work addresses training instability in GANs, a problem that affects researchers and practitioners in generative modeling. The contribution is incremental: it applies an existing optimization technique to a known bottleneck.

The paper tackles GAN training by using the conjugate gradient method to solve for a local Nash equilibrium. It reports better best-case Fréchet inception distance (FID) scores than SGD and momentum SGD, and better average scores than Adam.

One of the training strategies for generative models is to minimize the Jensen-Shannon divergence between the model distribution and the data distribution. Since the data distribution is unknown, generative adversarial networks (GANs) formulate this problem as a game between two models, a generator and a discriminator. The training can be formulated in the context of game theory and the local Nash equilibrium (LNE). It does not seem feasible to derive guarantees of stability or optimality for the existing methods, and this optimization problem is far more challenging than the single-objective setting. Here, we use the conjugate gradient method to reliably and efficiently solve the LNE problem in GANs. We give a proof and convergence analysis, under mild assumptions, showing that the proposed method converges to an LNE with three different learning rate update rules, including a constant learning rate. Finally, we demonstrate that the proposed method outperforms stochastic gradient descent (SGD) and momentum SGD in terms of best Fréchet inception distance (FID) score and outperforms Adam on average. The code is available at https://github.com/Hiroki11x/ConjugateGradient_GAN.
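
To make the idea in the abstract more concrete, here is a minimal sketch of a conjugate-gradient-type iteration on a two-player game. It is not the authors' implementation. The toy quadratic game, the function names `game_gradient` and `cg_solve`, the Fletcher-Reeves formula for `beta`, and the clipping value `beta_max` are all assumptions chosen for illustration. The paper and the linked repository remain the reference for the actual algorithm and its learning rate rules.

```python
# Toy two-player game (an assumption for illustration, not taken from the paper):
#   f(x, y) = 0.5 * x**2 + x * y - 0.5 * y**2
# The "generator" chooses x to minimize f; the "discriminator" chooses y to
# maximize f. The only Nash equilibrium of this game is (x, y) = (0, 0).


def game_gradient(x, y):
    """Return each player's gradient of its own loss, stacked as (gx, gy)."""
    gx = x + y       # d f / d x      (generator minimizes f)
    gy = -(x - y)    # d (-f) / d y   (discriminator minimizes -f)
    return gx, gy


def cg_solve(x, y, lr=0.1, beta_max=0.3, steps=2000):
    """Conjugate-gradient-type update with a constant learning rate.

    The search direction is d_n = -g_n + beta_n * d_{n-1}. Here beta_n is the
    Fletcher-Reeves ratio, clipped at beta_max (the clipping is an assumption
    made to keep this toy example stable).
    """
    dx, dy = 0.0, 0.0   # previous search direction
    prev_sq = None      # squared gradient norm from the previous step
    for _ in range(steps):
        gx, gy = game_gradient(x, y)
        sq = gx * gx + gy * gy
        if prev_sq is None or prev_sq == 0.0:
            beta = 0.0  # first step: plain gradient direction
        else:
            beta = min(sq / prev_sq, beta_max)
        dx, dy = -gx + beta * dx, -gy + beta * dy
        x, y = x + lr * dx, y + lr * dy
        prev_sq = sq
    return x, y


x, y = cg_solve(1.0, 1.0)
print(x, y)  # both coordinates end up very close to the equilibrium at 0
```

Setting `beta_max=0.0` reduces the loop to simultaneous gradient descent-ascent, which gives a simple baseline for comparing the two directions on the same toy game.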

Code Implementations: 1 repo
