CV IV

Mar 6, 2019

DepthwiseGANs: Fast Training Generative Adversarial Networks for Realistic Image Synthesis

Mkhuseli Ngxande, Jules-Raymond Tapamo, Michael Burke

arXiv:1903.02225v1

4.78 citations

Originality: Incremental advance

AI Analysis

This work addresses the computational cost of training GANs, a concern for researchers and practitioners in computer vision. It is an incremental advance: it builds on existing methods, with a focus on training speed.

The paper tackles slow training in Generative Adversarial Networks (GANs) by using depthwise separable convolutions. It reports shorter training times while maintaining realistic image synthesis, based on comparisons with StarGAN, and uses the Fréchet Inception Distance (FID) for evaluation.

Recent work has shown significant progress in the direction of synthetic data generation using Generative Adversarial Networks (GANs). GANs have been applied in many fields of computer vision including text-to-image conversion, domain transfer, super-resolution, and image-to-video applications. In computer vision, traditional GANs are based on deep convolutional neural networks. However, deep convolutional neural networks can require extensive computational resources because they are based on multiple operations performed by convolutional layers, which can consist of millions of trainable parameters. Training a GAN model can be difficult and it takes a significant amount of time to reach an equilibrium point. In this paper, we investigate the use of depthwise separable convolutions to reduce training time while maintaining data generation performance. Our results show that a DepthwiseGAN architecture can generate realistic images in shorter training periods when compared to a StarGan architecture, but that model capacity still plays a significant role in generative modelling. In addition, we show that depthwise separable convolutions perform best when only applied to the generator. For quality evaluation of generated images, we use the Fréchet Inception Distance (FID), which compares the similarity between the generated image distribution and that of the training dataset.
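To make the efficiency argument in the abstract concrete, the following minimal sketch compares the weight count of a standard convolution with that of a depthwise separable one (a per-channel k x k depthwise convolution followed by a 1 x 1 pointwise convolution). The function names and the layer sizes (256 input channels, 256 output channels, 3 x 3 kernel) are illustrative assumptions and are not taken from the paper; biases are omitted.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias omitted)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in, c_out, k):
    """Weights in a k x k depthwise conv plus a 1 x 1 pointwise conv (bias omitted)."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


# Hypothetical layer sizes, chosen only for illustration.
c_in, c_out, k = 256, 256, 3

std = standard_conv_params(c_in, c_out, k)
sep = depthwise_separable_params(c_in, c_out, k)

print(f"standard:            {std} weights")
print(f"depthwise separable: {sep} weights")
print(f"reduction factor:    {std / sep:.2f}x")
```

For these assumed sizes the separable layer needs roughly an order of magnitude fewer weights. This also suggests why the abstract notes that model capacity still matters: the savings come from removing parameters.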
