Generative models with kernel distance in data space
This work addresses unstable training and mode collapse in generative models and is aimed at researchers and practitioners, though it appears incremental because it builds on existing architectures.
The paper tackles the weaknesses of autoencoder-based and GAN-based generative models, such as blurry images and training instability, by proposing the LCW generator, which uses a kernel distance instead of a discriminator and achieves competitive Fréchet Inception Distance (FID) values.
Generative models that model a joint data distribution are generally either autoencoder based or GAN based. Both have their pros and cons: the former tend to generate blurry images, while the latter are unstable in training and prone to mode collapse. The objective of this paper is to construct a model situated between these two architectures, one that does not inherit their main weaknesses. The proposed LCW generator (Latent Cramer-Wold generator) resembles a classical GAN in that it transforms Gaussian noise into data space. Most importantly, the LCW generator uses a kernel distance instead of a discriminator. No adversarial training is used, hence the name generator. It is trained in two phases. First, an autoencoder-based architecture, using kernel measures, is built to model the data manifold. Second, we propose a Latent Trick that maps a Gaussian to the latent space to obtain the final model. This results in very competitive FID values.
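To illustrate what comparing distributions with a kernel distance instead of a discriminator can look like, the following is a minimal sketch of a sample-based squared kernel distance (maximum mean discrepancy). It assumes a plain Gaussian (RBF) kernel as a stand-in; it is not the Cramer-Wold kernel used by the LCW generator, and the function names, the `bandwidth` parameter, and the toy data are all hypothetical.

```python
import math
import random


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel between two points given as sequences of floats.

    Assumption: an RBF kernel is used here purely for illustration.
    """
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mean_kernel(xs, ys, bandwidth=1.0):
    """Average kernel value over all pairs (x, y) with x in xs and y in ys."""
    total = sum(gaussian_kernel(x, y, bandwidth) for x in xs for y in ys)
    return total / (len(xs) * len(ys))


def kernel_distance_sq(xs, ys, bandwidth=1.0):
    """Biased (V-statistic) estimate of the squared kernel distance."""
    return (mean_kernel(xs, xs, bandwidth)
            + mean_kernel(ys, ys, bandwidth)
            - 2.0 * mean_kernel(xs, ys, bandwidth))


def sample_gaussian(n, dim, mean, rng):
    """Draw n points from an isotropic unit-variance Gaussian centred at `mean`."""
    return [[rng.gauss(mean, 1.0) for _ in range(dim)] for _ in range(n)]


# Toy data: "real" samples, a matching sample set, and a shifted sample set.
rng = random.Random(0)
real = sample_gaussian(200, 2, 0.0, rng)
close = sample_gaussian(200, 2, 0.0, rng)
far = sample_gaussian(200, 2, 3.0, rng)

d_close = kernel_distance_sq(real, close)
d_far = kernel_distance_sq(real, far)
print("same distribution:   ", d_close)
print("shifted distribution:", d_far)
```

In a generator trained this way, a quantity of this kind would be computed between generated and real samples and minimized directly with respect to the generator's parameters, so no second network has to be trained adversarially.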