ML | LG | Apr 18, 2016

Gaussian Copula Variational Autoencoders for Mixed Data

arXiv:1604.04960v1 | 15 citations
Originality: Synthesis-oriented
AI Analysis

This work addresses a domain-specific challenge in generative modeling for mixed data types, offering an incremental improvement over existing methods.

The authors tackled the problem of modeling mixed categorical and continuous data with variational autoencoders by introducing a Gaussian copula approach, which improved data manifold capture and outperformed standard VAEs in experiments.

The variational autoencoder (VAE) is a generative model with continuous latent variables in which a probabilistic encoder (bottom-up) and a probabilistic decoder (top-down) are jointly learned by stochastic gradient variational Bayes. We first elaborate on the Gaussian VAE, approximating the local covariance matrix of the decoder as an outer product of the principal direction at a position determined by a sample drawn from a Gaussian distribution. We show that this model, referred to as VAE-ROC, captures the data manifold better than the standard Gaussian VAE, in which an independent multivariate Gaussian is used to model the decoder. We then extend VAE-ROC to handle mixed categorical and continuous data. To this end, we employ a Gaussian copula to model the local dependency in mixed categorical and continuous data, leading to the Gaussian copula variational autoencoder (GCVAE). As in VAE-ROC, we use the rank-one approximation for the covariance in the Gaussian copula to capture the local dependency structure in the mixed data. Experiments on various datasets demonstrate the useful behaviour of VAE-ROC and GCVAE compared to the standard VAE.
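To make the rank-one covariance idea concrete, here is a minimal sketch of a Gaussian log-density whose covariance is an isotropic term plus a single outer product. The parameterization `sigma2 * I + u u^T`, the function name `rank_one_gaussian_logpdf`, and its arguments are assumptions made for illustration; the abstract does not give the exact form used in VAE-ROC. The sketch only shows why such a structure is cheap: the Sherman-Morrison identity and the matrix determinant lemma give the inverse and determinant in time linear in the dimension.

```python
import math


def rank_one_gaussian_logpdf(x, mean, u, sigma2):
    """Log-density of N(mean, sigma2 * I + u u^T), computed in O(d) time.

    Assumed parameterization (not taken from the paper): an isotropic
    term sigma2 * I plus one rank-one term u u^T, where u plays the role
    of a principal direction.
    """
    d = len(x)
    r = [xi - mi for xi, mi in zip(x, mean)]        # residual x - mean
    rr = sum(ri * ri for ri in r)                   # r^T r
    ur = sum(ui * ri for ui, ri in zip(u, r))       # u^T r
    uu = sum(ui * ui for ui in u)                   # u^T u

    # Sherman-Morrison: Sigma^{-1} = (I - u u^T / (sigma2 + u^T u)) / sigma2
    quad = (rr - ur * ur / (sigma2 + uu)) / sigma2

    # Matrix determinant lemma: det(Sigma) = sigma2^d * (1 + u^T u / sigma2)
    logdet = d * math.log(sigma2) + math.log1p(uu / sigma2)

    return -0.5 * (d * math.log(2.0 * math.pi) + logdet + quad)


# Example: u = (1, 0) and sigma2 = 1 give Sigma = diag(2, 1).
print(rank_one_gaussian_logpdf([1.0, 1.0], [0.0, 0.0], [1.0, 0.0], 1.0))
```

Setting `u` to the zero vector recovers an isotropic Gaussian, which is one way to see how a rank-one term adds a single direction of correlated variation on top of an independent decoder.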
