Generation of data on discontinuous manifolds via continuous stochastic non-invertible networks
This work addresses a specific challenge in generative modeling for discontinuous data, offering a novel approach but likely incremental as it builds on existing encoder and network methods.
The authors tackled the problem of generating discontinuous distributions, which is difficult for existing generative models like VAEs and GANs, by using a pre-trained contrastive encoder to create a clustered latent space and training dedicated networks for each cluster; they demonstrated accurate reconstruction and generation on synthetic 2D distributions.
The generation of discontinuous distributions is a difficult task for most known frameworks such as generative autoencoders and generative adversarial networks. Generative non-invertible models are unable to accurately generate such distributions, require long training and often are subject to mode collapse. Variational autoencoders (VAEs), which are based on the idea of keeping the latent space to be Gaussian for the sake of a simple sampling, allow an accurate reconstruction, while they experience significant limitations at generation task. In this work, instead of trying to keep the latent space to be Gaussian, we use a pre-trained contrastive encoder to obtain a clustered latent space. Then, for each cluster, representing a unimodal submanifold, we train a dedicated low complexity network to generate this submanifold from the Gaussian distribution. The proposed framework is based on the information-theoretic formulation of mutual information maximization between the input data and latent space representation. We derive a link between the cost functions and the information-theoretic formulation. We apply our approach to synthetic 2D distributions to demonstrate both reconstruction and generation of discontinuous distributions using continuous stochastic networks.