The Variational Homoencoder: Learning to learn high capacity generative models from few examples
This paper addresses the challenge of learning high-capacity generative models from few examples. The contribution is incremental: it modifies the Variational Autoencoder to improve latent variable utilization.
The paper tackles the problem that existing learning techniques fail to use latent variables effectively when hierarchical Bayesian models are expressed as powerful neural networks such as PixelCNN. It introduces the Variational Homoencoder (VHE) to make better use of latent variables. The resulting hierarchical PixelCNN outperforms all existing models on Omniglot test set likelihood and performs strongly on one-shot generation and classification tasks.
Hierarchical Bayesian methods can unify many related tasks (e.g. k-shot classification, conditional and unconditional generation) as inference within a single generative model. However, when this generative model is expressed as a powerful neural network such as a PixelCNN, we show that existing learning techniques typically fail to effectively use latent variables. To address this, we develop a modification of the Variational Autoencoder in which encoded observations are decoded to new elements from the same class. This technique, which we call a Variational Homoencoder (VHE), produces a hierarchical latent variable model which better utilises latent variables. We use the VHE framework to learn a hierarchical PixelCNN on the Omniglot dataset, which outperforms all existing models on test set likelihood and achieves strong performance on one-shot generation and classification tasks. We additionally validate the VHE on natural images from the YouTube Faces database. Finally, we develop extensions of the model that apply to richer dataset structures such as factorial and hierarchical categories.
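The core idea of decoding encoded observations to new elements from the same class can be illustrated with a minimal NumPy sketch. This is not the paper's implementation. The linear encoder and decoder, the Gaussian likelihood, and the names `gaussian_kl` and `vhe_style_objective` are hypothetical choices for illustration, and dividing the KL term by the class size is an assumption about how the cost of a shared latent is amortised, not something stated in the abstract.

```python
import numpy as np

def gaussian_kl(mu, log_var):
    """KL[N(mu, diag(exp(log_var))) || N(0, I)], summed over latent dimensions."""
    return 0.5 * np.sum(np.exp(log_var) + mu**2 - 1.0 - log_var)

def vhe_style_objective(x, support, class_size, enc_w, dec_w, rng):
    """Single-sample estimate of a VHE-style lower bound for one target x.

    x          : (d,) target element to reconstruct
    support    : (k, d) other elements from the same class as x
    class_size : number of elements in the class (assumed KL rescaling)
    enc_w      : (d, 2*z) hypothetical linear encoder weights
    dec_w      : (z, d) hypothetical linear decoder weights
    """
    z_dim = dec_w.shape[0]
    # Encode the support set by averaging per-element statistics.
    stats = (support @ enc_w).mean(axis=0)
    mu, log_var = stats[:z_dim], stats[z_dim:]
    # Reparameterised sample of the class-level latent c.
    c = mu + np.exp(0.5 * log_var) * rng.standard_normal(z_dim)
    # Decode to the target x (not the encoded elements), unit-variance Gaussian.
    recon = c @ dec_w
    log_lik = -0.5 * np.sum((x - recon) ** 2) - 0.5 * x.size * np.log(2 * np.pi)
    # Assumption: the latent is shared by the class, so its KL cost is divided
    # by the class size.
    return log_lik - gaussian_kl(mu, log_var) / class_size

# Toy usage: one synthetic "class" of 20 elements in 8 dimensions.
rng = np.random.default_rng(0)
d, z, n = 8, 2, 20
class_data = rng.standard_normal((n, d))
enc_w = 0.1 * rng.standard_normal((d, 2 * z))
dec_w = 0.1 * rng.standard_normal((z, d))
bound = vhe_style_objective(class_data[0], class_data[1:6], n, enc_w, dec_w, rng)
```

In this sketch the target `class_data[0]` is deliberately excluded from the support set `class_data[1:6]`, so the latent can only help reconstruction by capturing what the class members share.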