Latent Space Non-Linear Statistics
This work addresses the challenge of applying statistical methods to latent representations in machine learning. The contribution is incremental in that it builds on existing nonlinear manifold statistics.
The paper tackles statistical analysis in the latent spaces of deep generative models, where the nonlinear geometry makes linear methods inapplicable. It develops techniques for maximum likelihood inference and addresses computational complexity by approximating Riemannian metrics with neural networks.
Given data, deep generative models, such as variational autoencoders (VAE) and generative adversarial networks (GAN), learn a lower-dimensional latent representation of the data space. The linear Euclidean geometry of data space pulls back to a nonlinear Riemannian geometry on the latent space. The latent space thus provides a low-dimensional nonlinear representation of data, and classical linear statistical techniques are no longer applicable. In this paper we show how statistics of data in their latent space representation can be performed using techniques from the field of nonlinear manifold statistics. Nonlinear manifold statistics provide generalizations of Euclidean statistical notions including means, principal component analysis, and maximum likelihood fits of parametric probability distributions. We develop new techniques for maximum likelihood inference in latent space, and address the computational complexity of using geometric algorithms with high-dimensional data by training a separate neural network to approximate the Riemannian metric and cometric tensor capturing the shape of the learned data manifold.
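To make the pullback construction concrete, the following is a minimal sketch, not the paper's implementation. It assumes a hypothetical toy decoder f(z1, z2) = (z1, z2, z1^2 + z2^2) in place of a trained VAE or GAN generator, and computes the pulled-back metric G(z) = J(z)^T J(z) from a finite-difference Jacobian J. The cometric is the inverse of G. The names decoder, jacobian, pullback_metric, cometric_2d, and curve_length are illustrative only.

```python
import math


def decoder(z):
    # Hypothetical toy decoder (an assumption, not a trained model):
    # maps a 2-D latent point onto a paraboloid in 3-D data space.
    z1, z2 = z
    return [z1, z2, z1 * z1 + z2 * z2]


def jacobian(f, z, eps=1e-6):
    # Central finite-difference Jacobian; J[i][j] = d f_i / d z_j.
    n = len(z)
    cols = []
    for j in range(n):
        zp, zm = list(z), list(z)
        zp[j] += eps
        zm[j] -= eps
        fp, fm = f(zp), f(zm)
        cols.append([(a - b) / (2 * eps) for a, b in zip(fp, fm)])
    m = len(cols[0])
    return [[cols[j][i] for j in range(n)] for i in range(m)]


def pullback_metric(f, z):
    # G(z) = J(z)^T J(z): the Euclidean data-space metric pulled back to latent space.
    J = jacobian(f, z)
    n = len(z)
    return [[sum(row[a] * row[b] for row in J) for b in range(n)] for a in range(n)]


def cometric_2d(G):
    # Cometric = inverse of the metric; closed form for the 2x2 case only.
    (a, b), (c, d) = G
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def curve_length(f, z_start, z_end, steps=100):
    # Riemannian length of the straight latent segment from z_start to z_end,
    # approximated with the midpoint rule: sum of sqrt(dz^T G(z) dz).
    n = len(z_start)
    dz = [(e - s) / steps for s, e in zip(z_start, z_end)]
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) / steps
        z = [s + t * (e - s) for s, e in zip(z_start, z_end)]
        G = pullback_metric(f, z)
        sq = sum(dz[a] * G[a][b] * dz[b] for a in range(n) for b in range(n))
        total += math.sqrt(sq)
    return total


G = pullback_metric(decoder, [0.5, -1.0])
G_inv = cometric_2d(G)
length = curve_length(decoder, [0.0, 0.0], [1.0, 0.0])
```

For this toy decoder the metric has the closed form G(z) = [[1 + 4 z1^2, 4 z1 z2], [4 z1 z2, 1 + 4 z2^2]], so the numerical result can be checked by hand. The segment from (0, 0) to (1, 0) has Euclidean latent length 1 but a larger length under G, which is why Euclidean means and distances in latent coordinates can be misleading. With a high-dimensional decoder, evaluating J at every step becomes costly, which is the expense the abstract proposes to avoid by training a separate network to approximate the metric and cometric.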