No Representation without Transformation
This work addresses the problem of learning interpretable latent transformations, a topic of interest in representation learning, though it appears to be an incremental extension of existing variational autoencoder frameworks.
The authors extended variational autoencoders to explicitly represent transformations in latent space using hierarchical graphical models, where higher-order objects are inferred jointly with the latent representations they act on. They demonstrated that these inferred transformations reflect interpretable properties in observation space and, using the trained encoder, outperformed the baselines by a wide margin on a challenging out-of-distribution classification task.
We extend the framework of variational autoencoders to represent transformations explicitly in the latent space. In the family of hierarchical graphical models that emerges, the latent space is populated by higher-order objects that are inferred jointly with the latent representations they act on. To demonstrate the effect of these higher-order objects explicitly, we show that the inferred latent transformations reflect interpretable properties in the observation space. Furthermore, the model is structured in such a way that, in the absence of transformations, we can run inference and obtain generative capabilities comparable with standard variational autoencoders. Finally, using the trained encoder, we outperform the baselines by a wide margin on a challenging out-of-distribution classification task.
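To make the idea of a latent transformation acting on a latent representation concrete, the following is a minimal sketch and not the model described above. It assumes, purely for illustration, that a transformation is a linear operator `T` applied to a latent vector `z`, and that the decoder is a fixed linear map `W_dec`. The names `sample_pair`, `matvec`, `identity`, `LATENT_DIM`, and `OBS_DIM` are hypothetical and do not come from the abstract. When no transformation is supplied, the sketch reduces to the usual path of sampling a latent and decoding it.

```python
import random

LATENT_DIM = 4  # hypothetical sizes, chosen only for illustration
OBS_DIM = 6


def matvec(M, v):
    """Multiply matrix M (a list of rows) by vector v."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in M]


def identity(n):
    """Return the n x n identity matrix as a list of rows."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def sample_pair(rng, W_dec, T=None):
    """Ancestral sampling of an observation and its transformed counterpart.

    Assumed toy generative process (not taken from the source):
        z   ~ N(0, I)        latent representation
        z_t = T z            latent after the transformation acts on it
        x   = W_dec z        decoded observation
        x_t = W_dec z_t      decoded transformed observation

    With T=None only z is sampled and decoded, which corresponds to the
    ordinary generative path without any transformation.
    """
    z = [rng.gauss(0.0, 1.0) for _ in range(LATENT_DIM)]
    x = matvec(W_dec, z)
    if T is None:
        return x, None
    z_t = matvec(T, z)
    return x, matvec(W_dec, z_t)


rng = random.Random(0)
# Toy linear decoder of shape OBS_DIM x LATENT_DIM (stands in for a network).
W_dec = [[rng.gauss(0.0, 1.0) for _ in range(LATENT_DIM)] for _ in range(OBS_DIM)]

# The identity transformation leaves the latent, and hence the output, unchanged.
x, x_t = sample_pair(rng, W_dec, T=identity(LATENT_DIM))

# Without a transformation only a single observation is generated.
x_only, no_transform = sample_pair(rng, W_dec)
```

In this toy setting the transformation is given rather than inferred. In the framework the abstract describes, the higher-order objects are inferred jointly with the latents, and that inference step is not shown here.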