CV AI · Jun 28, 2021

Are conditional GANs explicitly conditional?

Houssem eddine Boulahbal, Adrian Voicila, Andrew Comport

arXiv:2106.15011v3 · 2.61 citations

Originality: Highly original

AI Analysis

This paper addresses a fundamental limitation in conditional generative modeling that is relevant to computer vision researchers and practitioners, and proposes an approach intended to improve model reliability and output quality.

The paper argues that conditional GANs (cGANs) do not inherently learn the conditionality between inputs, and proposes a method called a contrario cGAN that models conditionality explicitly. The authors report significant performance improvements across applications such as semantic image synthesis and image segmentation.

This paper proposes two contributions for conditional Generative Adversarial Networks (cGANs) that aim to improve the wide variety of applications built on this architecture. The first contribution is an analysis of cGANs showing that they are not explicitly conditional. In particular, it is shown that the discriminator, and subsequently the cGAN as a whole, does not automatically learn the conditionality between inputs. The second contribution is a new method, called a contrario cGAN, that explicitly models conditionality for both parts of the adversarial architecture via a novel a contrario loss, which trains the discriminator to learn unconditional (adverse) examples. This leads to a novel data augmentation approach for GANs (a contrario learning) that uses adverse examples to restrict the search space of the generator to conditional outputs. Extensive experiments evaluate the conditionality of the discriminator through a proposed probability distribution analysis. Comparisons with the cGAN architecture across different applications, including semantic image synthesis, image segmentation, monocular depth prediction and "single label"-to-image, show significant performance improvements on well-known datasets, measured with metrics including Fréchet Inception Distance (FID), mean Intersection over Union (mIoU), Root Mean Square Error log (RMSE log) and Number of statistically-Different Bins (NDB).
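To make the idea of training the discriminator on unconditional (adverse) examples more concrete, the following is a minimal sketch and not the authors' implementation. It assumes that an adverse pair can be built by pairing an output with a condition taken from a different sample in the batch, and that such pairs are labeled as "fake" alongside generated pairs in a binary cross-entropy loss. The function names (`bce`, `make_a_contrario_pairs`, `discriminator_loss`) and the equal weighting of the three terms are hypothetical choices made for illustration.

```python
import math
import random


def bce(prob, target):
    """Binary cross-entropy for a single predicted probability."""
    eps = 1e-12
    prob = min(max(prob, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(target * math.log(prob) + (1.0 - target) * math.log(1.0 - prob))


def make_a_contrario_pairs(conditions, outputs, rng):
    """Pair each output with the condition of a *different* sample.

    Assumption: rotating the conditions by a random non-zero shift is one
    simple way to obtain mismatched pairs. It guarantees a different index,
    not a different value, so duplicate conditions in a batch can still match.
    """
    n = len(conditions)
    if n < 2:
        raise ValueError("need at least two samples to build mismatched pairs")
    shift = rng.randrange(1, n)
    return [(conditions[(i + shift) % n], outputs[i]) for i in range(n)]


def discriminator_loss(disc, conditions, reals, fakes, rng):
    """Mean discriminator loss over matched-real, generated and adverse pairs.

    disc(condition, output) is assumed to return a probability in (0, 1)
    that the pair is both realistic and consistent with the condition.
    """
    adverse = make_a_contrario_pairs(conditions, reals, rng)
    terms = []
    terms += [bce(disc(c, y), 1.0) for c, y in zip(conditions, reals)]  # real, matched
    terms += [bce(disc(c, y), 0.0) for c, y in zip(conditions, fakes)]  # generated
    terms += [bce(disc(c, y), 0.0) for c, y in adverse]                 # real, mismatched
    return sum(terms) / len(terms)
```

Under these assumptions, a discriminator that ignores its condition scores mismatched real pairs as highly as matched ones and is penalized by the third term, while a discriminator that checks the condition is not. In a real training loop `disc` would be a neural network and the loss would be computed on tensors; plain Python numbers are used here only to keep the sketch self-contained.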
