DeshuffleGAN: A Self-Supervised GAN to Improve Structure Learning
This work addresses the challenge of generating realistic images for computer vision and image synthesis applications. It is incremental in that it extends existing GAN frameworks with a self-supervised task.
The authors aim to improve GAN performance, in terms of realism and similarity to the original data distribution, by enhancing spatial structure learning. They report consistent improvements in generated images over baseline methods on two datasets.
Generative Adversarial Networks (GANs) have triggered increased interest in the problem of image generation, owing to their improved output image quality and the ease with which they can be extended into new methods. Numerous GAN-based works attempt to improve generation through architectural and loss-based extensions. We argue that one of the crucial points for improving GAN performance, in terms of realism and similarity to the original data distribution, is to give the model the capability to learn the spatial structure in data. To that end, we propose the DeshuffleGAN, which enhances the learning of the discriminator and the generator via a self-supervision approach. Specifically, we introduce a deshuffling task that solves a puzzle of randomly shuffled image tiles, which in turn helps the DeshuffleGAN increase its expressive capacity for spatial structure and realistic appearance. We provide experimental evidence of improved quality in generated images compared to the baseline methods, and this improvement is consistently observed over two different datasets.