Manifold Relevance Determination
This work addresses the challenge of modeling complex, high-dimensional data with multiple views for applications like image generation and pose prediction, representing an incremental improvement over previous methods by introducing soft sharing and Bayesian techniques.
The paper tackles the problem of learning efficient latent representations from multiple data views by introducing a fully Bayesian latent variable model that factorizes the latent space into shared and private components, allowing for a softly shared latent space and automatic dimensionality estimation. It demonstrates the model's capability by generating novel images from unprocessed high-dimensional data and predicting human pose in ambiguous settings with principled disambiguation.
In this paper we present a fully Bayesian latent variable model which exploits conditional nonlinear(in)-dependence structures to learn an efficient latent representation. The latent space is factorized to represent shared and private information from multiple views of the data. In contrast to previous approaches, we introduce a relaxation to the discrete segmentation and allow for a "softly" shared latent space. Further, Bayesian techniques allow us to automatically estimate the dimensionality of the latent spaces. The model is capable of capturing structure underlying extremely high dimensional spaces. This is illustrated by modelling unprocessed images with tenths of thousands of pixels. This also allows us to directly generate novel images from the trained model by sampling from the discovered latent spaces. We also demonstrate the model by prediction of human pose in an ambiguous setting. Our Bayesian framework allows us to perform disambiguation in a principled manner by including latent space priors which incorporate the dynamic nature of the data.