Generating new pictures in complex datasets with a simple neural network
This is an incremental improvement in image generation, focusing on models with a small latent space trained on specific complex datasets.
The paper tackles the problem of generating perturbations of images from complex datasets such as CIFAR-10, using a variational auto-encoder with only two latent dimensions per class. It achieves good generation, but not all training images are reconstructed well, and an additional classifier is required to weight the training images.
We introduce a version of a variational auto-encoder (VAE) that can generate good perturbations of images when trained on a complex dataset (in our experiments, CIFAR-10). The network uses only two latent generative dimensions per class, with a unimodal probability density. The price one has to pay for good generation is that not all training images are well reconstructed. An additional classifier is required to determine which training images are well reconstructed and, more generally, to assign weights to the training images. Only well-reconstructed training images can be perturbed. For good perturbations, we use the tentative empirical drifts of well-reconstructed images. The construct is not predictive in the usual statistical sense.
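To make the latent layout concrete, the following is a minimal sketch in Python of one possible reading of the setup, not the authors' implementation. It assumes that the per-class dimensions are concatenated into a single latent vector (ten CIFAR-10 classes times two dimensions), that the unimodal density is a diagonal Gaussian, and that the classifier's output enters as a per-image weight on a standard VAE loss. The abstract specifies none of these choices, and all function names are hypothetical.

```python
import math
import random

NUM_CLASSES = 10      # CIFAR-10 has ten classes
DIMS_PER_CLASS = 2    # two latent generative dimensions per class (from the abstract)
LATENT_DIM = NUM_CLASSES * DIMS_PER_CLASS  # assumption: per-class blocks are concatenated


def class_latent_indices(label):
    """Indices of the latent vector assumed to belong to one class."""
    start = label * DIMS_PER_CLASS
    return range(start, start + DIMS_PER_CLASS)


def sample_class_latent(mu, log_var, label, rng):
    """Reparameterized sample from a diagonal Gaussian (assumed unimodal density).

    Only the two dimensions of the given class are sampled; the rest stay zero.
    """
    z = [0.0] * LATENT_DIM
    for i in class_latent_indices(label):
        z[i] = mu[i] + math.exp(0.5 * log_var[i]) * rng.gauss(0.0, 1.0)
    return z


def negative_elbo(x, x_recon, mu, log_var):
    """Squared-error reconstruction term plus KL divergence to a standard normal."""
    recon = sum((a - b) ** 2 for a, b in zip(x, x_recon))
    kl = -0.5 * sum(1.0 + lv - m * m - math.exp(lv) for m, lv in zip(mu, log_var))
    return recon + kl


def weighted_loss(batch, weights):
    """Weighted mean of per-image losses.

    batch: list of (x, x_recon, mu, log_var) tuples.
    weights: hypothetical per-image weights, standing in for the classifier output.
    """
    total = sum(w * negative_elbo(*item) for item, w in zip(batch, weights))
    return total / sum(weights)
```

Under these assumptions, an image with weight zero contributes nothing to the loss, which mirrors the statement that only well-reconstructed training images are used for perturbation.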