ML LG | Feb 22, 2023

Distributional Learning of Variational AutoEncoder: Application to Synthetic Data Generation

arXiv:2302.11294v311.814 citations | h-index: 10 | Has Code

Originality: Incremental advance

AI Analysis

For machine learning researchers and practitioners, this work addresses a known limitation of VAE models, the Gaussianity assumption, and offers an incremental improvement in synthetic data generation with easier control over data privacy.

The authors address the Gaussianity assumption in Variational Autoencoders (VAEs) by proposing a decoder built on an infinite mixture of asymmetric Laplace distributions. The resulting model fits the distributions of continuous variables more flexibly and, when applied to synthetic data generation, makes the level of data privacy easy to adjust.

The Gaussianity assumption has been consistently criticized as a main limitation of the Variational Autoencoder (VAE), despite its computational efficiency. In this paper, we propose a new approach that expands the model capacity (i.e., the expressive power of the distributional family) without sacrificing the computational advantages of the VAE framework. Our VAE model's decoder is composed of an infinite mixture of asymmetric Laplace distributions, which possesses general distribution-fitting capabilities for continuous variables. Our model is represented by a special form of a nonparametric M-estimator for estimating general quantile functions, and we theoretically establish the connection between the proposed model and quantile estimation. We apply the proposed model to synthetic data generation, and in particular, our model demonstrates superiority in easily adjusting the level of data privacy.
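The link between asymmetric Laplace distributions and quantile estimation mentioned in the abstract can be illustrated with a small sketch. It relies on a standard fact: the negative log-likelihood of an asymmetric Laplace distribution with skew parameter tau equals the pinball (check) loss at level tau plus a constant, so maximizing that likelihood estimates the tau-quantile. The code below is not the authors' implementation. It is an unconditional toy with no encoder, latent variable, or neural network, and the function names (`pinball_loss`, `asymmetric_laplace_nll`, `fit_quantile`, `sample_from_quantiles`), the grid search, and the choice of quantile levels are assumptions made for illustration.

```python
import numpy as np

def pinball_loss(y, q, tau):
    """Mean check loss rho_tau(y - q) for a scalar candidate quantile q."""
    u = y - q
    return np.mean(u * (tau - (u < 0).astype(float)))

def asymmetric_laplace_nll(y, q, tau, scale=1.0):
    """Mean negative log-likelihood of an asymmetric Laplace distribution
    with location q, skew tau, and the given scale.
    Density: tau * (1 - tau) / scale * exp(-rho_tau((y - q) / scale))."""
    u = (y - q) / scale
    check = np.mean(u * (tau - (u < 0).astype(float)))
    return check - np.log(tau * (1.0 - tau) / scale)

def fit_quantile(y, tau, grid):
    """Illustrative estimator: pick the grid point minimizing the pinball loss."""
    losses = [pinball_loss(y, q, tau) for q in grid]
    return grid[int(np.argmin(losses))]

def sample_from_quantiles(taus, quantiles, n, rng):
    """Inverse-transform sampling: draw uniform levels and interpolate
    the fitted quantile curve (values outside the levels are clamped)."""
    levels = rng.uniform(size=n)
    return np.interp(levels, taus, quantiles)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    y = rng.exponential(size=5000)          # skewed, clearly non-Gaussian data
    grid = np.linspace(0.0, 10.0, 2001)     # candidate quantile values
    taus = np.linspace(0.05, 0.95, 19)      # assumed set of quantile levels
    fitted = np.array([fit_quantile(y, t, grid) for t in taus])
    synthetic = sample_from_quantiles(taus, fitted, 1000, rng)
    print("fitted median:", fitted[9], "empirical median:", np.quantile(y, 0.5))
    print("synthetic sample mean:", synthetic.mean())
```

Fitting the quantile at many levels and then sampling a level uniformly is one way to read the "infinite mixture" in this toy setting: each level contributes one asymmetric Laplace component, and together the fitted quantiles approximate the whole distribution without any Gaussian assumption. In the paper, the quantile function is additionally conditioned on the VAE's latent variable, which this sketch omits.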
