InfoCatVAE: Representation Learning with Categorical Variational Autoencoders
This work is aimed at machine learning researchers working on representation learning and presents an incremental improvement over existing VAE and InfoGAN methods.
The paper tackles unsupervised disentangled representation learning by extending variational autoencoders with a multimodal prior and a multimodal inference network. It connects the resulting ELBO objective with a soft clustering objective, and it adapts the InfoGAN method to maximize the mutual information between the categorical code and the generated inputs, which yields an improved model.
This paper describes InfoCatVAE, an extension of the variational autoencoder that enables unsupervised disentangled representation learning. InfoCatVAE uses multimodal distributions for the prior and the inference network and maximizes the evidence lower bound (ELBO) objective. We connect the new ELBO derived for our model with a natural soft clustering objective, which explains the robustness of our approach. We then adapt the InfoGAN method to our setting in order to maximize the mutual information between the categorical code and the generated inputs, and obtain an improved model.
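To make the two objectives named in the abstract concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes a generic categorical-latent VAE in which a code c is drawn from a categorical prior, the latent z follows a diagonal Gaussian whose mean depends on c, and the inference network outputs a categorical q(c|x) together with one diagonal Gaussian q(z|x, c) per category. Under those assumptions the ELBO for a single input splits into a reconstruction term, a per-category Gaussian KL term weighted by q(c|x), and a categorical KL term; the weights q(c|x) play the role of soft cluster assignments. The second function is the standard InfoGAN variational lower bound on the mutual information. All function names, argument names, and shapes are hypothetical and chosen for illustration only.

```python
import numpy as np


def kl_diag_gaussians(mu_q, var_q, mu_p, var_p):
    """KL(N(mu_q, diag(var_q)) || N(mu_p, diag(var_p))), summed over the last axis."""
    return 0.5 * np.sum(
        np.log(var_p / var_q) + (var_q + (mu_q - mu_p) ** 2) / var_p - 1.0,
        axis=-1,
    )


def kl_categorical(q, p):
    """KL(q || p) between categorical distributions on the last axis."""
    q = np.clip(q, 1e-12, 1.0)  # avoid log(0) for one-hot q
    return np.sum(q * (np.log(q) - np.log(p)), axis=-1)


def categorical_elbo(recon_loglik, q_c, mu_q, var_q, prior_mu, prior_var, prior_c):
    """ELBO for one input x under the assumed model (illustrative only).

    recon_loglik : (K,)   estimate of E_{q(z|x,c)}[log p(x|z)] for each category c
    q_c          : (K,)   inferred q(c|x), acting as soft cluster assignments
    mu_q, var_q  : (K, D) parameters of q(z|x,c)
    prior_mu, prior_var : (K, D) parameters of the prior p(z|c)
    prior_c      : (K,)   categorical prior p(c)
    """
    kl_z = kl_diag_gaussians(mu_q, var_q, prior_mu, prior_var)  # shape (K,)
    return np.sum(q_c * (recon_loglik - kl_z)) - kl_categorical(q_c, prior_c)


def info_lower_bound(log_q_true_code, prior_c):
    """InfoGAN-style bound: E[log Q(c|x)] + H(c) <= I(c; x).

    log_q_true_code : (N,) log Q(c_i | x_i) for generated x_i, evaluated at the
                      code c_i that was used to generate x_i
    prior_c         : (K,) categorical prior the codes were sampled from
    """
    entropy = -np.sum(prior_c * np.log(prior_c))
    return np.mean(log_q_true_code) + entropy


# Toy check with K = 3 categories and D = 2 latent dimensions.
prior_c = np.full(3, 1.0 / 3.0)
prior_mu = np.array([[0.0, 0.0], [3.0, 0.0], [0.0, 3.0]])
unit_var = np.ones((3, 2))

# Posterior equal to the prior and zero reconstruction term: both KL terms vanish.
elbo_matched = categorical_elbo(
    np.zeros(3), prior_c, prior_mu, unit_var, prior_mu, unit_var, prior_c
)

# A hard assignment to the first category pays KL(q(c|x) || p(c)) = log 3.
elbo_hard = categorical_elbo(
    np.zeros(3), np.array([1.0, 0.0, 0.0]), prior_mu, unit_var, prior_mu, unit_var, prior_c
)
```

In a trained model the reconstruction term would come from a decoder network and Monte Carlo samples of z; here it is passed in directly so that the decomposition stays visible.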