CV AI LG ML

May 12, 2023

Spider GAN: Leveraging Friendly Neighbors to Accelerate GAN Training

Siddarth Asokan, Chandra Sekhar Seelamantula

arXiv:2305.07613v13.95 citations

Has Code

Originality: Incremental advance

AI Analysis

The paper targets faster and more stable GAN training for image generation, though it appears to be an incremental enhancement to existing GAN architectures.

The paper tackles the challenge of stable GAN training by proposing Spider GAN, which uses images as inputs instead of noise to leverage their structure, resulting in state-of-the-art FID values with one-fifth of the training iterations on high-resolution small datasets.

Training generative adversarial networks (GANs) stably is a challenging task. The generator in a GAN transforms noise vectors, typically Gaussian distributed, into realistic data such as images. In this paper, we propose a novel approach for training GANs with images as inputs, but without enforcing any pairwise constraints. The intuition is that images are more structured than noise, which the generator can leverage to learn a more robust transformation. The process can be made efficient by identifying closely related datasets, or a "friendly neighborhood" of the target distribution, inspiring the moniker Spider GAN. To define friendly neighborhoods leveraging proximity between datasets, we propose a new measure called the signed inception distance (SID), inspired by the polyharmonic kernel. We show that the Spider GAN formulation results in faster convergence, as the generator can discover correspondence even between seemingly unrelated datasets, for instance, between Tiny-ImageNet and CelebA faces. Further, we demonstrate cascading Spider GAN, where the output distribution from a pre-trained GAN generator is used as the input to the subsequent network, effectively transporting one distribution to another in a cascaded fashion until the target is learnt, a new flavor of transfer learning. We demonstrate the efficacy of the Spider approach on DCGAN, conditional GAN, PGGAN, StyleGAN2 and StyleGAN3. The proposed approach achieves state-of-the-art Fréchet inception distance (FID) values with one-fifth of the training iterations compared with the baseline counterparts on high-resolution small datasets such as MetFaces, Ukiyo-E Faces and AFHQ-Cats.
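The two structural ideas in the abstract, unpaired image inputs and cascading, can be illustrated with a toy sketch. This is not the authors' code: every function name below is hypothetical, images are stood in for by plain lists of floats, and the generators are arbitrary callables rather than trained networks. It only shows where the generator input comes from and how stages would chain.

```python
import random


def noise_batch(batch_size, dim, rng):
    """Classical GAN input: i.i.d. standard Gaussian vectors."""
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(batch_size)]


def neighbor_batch(source_dataset, batch_size, rng):
    """Spider-style input (assumed form): samples from a 'friendly neighbor' dataset."""
    return [rng.choice(source_dataset) for _ in range(batch_size)]


def unpaired_batches(source_dataset, target_dataset, batch_size, rng):
    """Draw generator inputs and real targets independently.

    No source image is tied to a particular target image, which is the
    'no pairwise constraints' setting described in the abstract.
    """
    inputs = neighbor_batch(source_dataset, batch_size, rng)
    reals = [rng.choice(target_dataset) for _ in range(batch_size)]
    return inputs, reals


def cascade(*generators):
    """Compose generators so that each stage's output feeds the next stage."""
    def run(x):
        for g in generators:
            x = g(x)
        return x
    return run
```

In a real training loop, `inputs` would be passed through the generator and `reals` would go to the discriminator; the only change from a standard GAN in this sketch is that `neighbor_batch` replaces `noise_batch` as the input source.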
