LG AI CV · Jun 8, 2023

Unscented Autoencoder

Faris Janjoš, Lars Rosenbaum, Maxim Dolgov, J. Marius Zöllner

arXiv:2306.05256v1 · 5.33 citations · h-index: 13 · Has Code

Originality: Incremental advance

AI Analysis

This work targets a specific bottleneck in VAE-based generative modeling, namely the sampling variance and reconstruction quality associated with the reparameterization trick, and offers an incremental improvement over existing VAE methods.

The paper tackles high variance and low-quality reconstructions in Variational Autoencoders by introducing the Unscented Autoencoder (UAE). The UAE represents the latent posterior with deterministic sigma points from the Unscented Transform and replaces the KL divergence with the Wasserstein distance. The authors report competitive FID scores against closely related models and lower training variance than the VAE.
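
To make the KL-to-Wasserstein swap concrete, the following is a sketch under stated assumptions rather than the paper's exact loss: it assumes a diagonal Gaussian posterior $q = \mathcal{N}(\mu, \mathrm{diag}(\sigma^2))$ and a standard normal prior $p = \mathcal{N}(0, I)$ (neither is specified in the text above) and uses the standard closed forms for both quantities.

```latex
% Assumed setting: q = N(mu, diag(sigma^2)), p = N(0, I), latent dimension n.
\begin{align}
  D_{\mathrm{KL}}(q \,\|\, p)
    &= \frac{1}{2} \sum_{i=1}^{n} \left( \mu_i^2 + \sigma_i^2 - 1 - \log \sigma_i^2 \right), \\
  W_2^2(q, p)
    &= \sum_{i=1}^{n} \mu_i^2 + \sum_{i=1}^{n} \left( \sigma_i - 1 \right)^2 .
\end{align}
```

As $\sigma_i \to 0$ the KL term grows without bound through $-\log \sigma_i^2$, whereas the $W_2^2$ term stays bounded by $1$ per dimension, which is consistent with the claim that the Wasserstein metric permits a sharper posterior.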

The Variational Autoencoder (VAE) is a seminal approach in deep generative modeling with latent variables. Interpreting its reconstruction process as a nonlinear transformation of samples from the latent posterior distribution, we apply the Unscented Transform (UT) -- a well-known distribution approximation used in the Unscented Kalman Filter (UKF) from the field of filtering. A finite set of statistics called sigma points, sampled deterministically, provides a more informative and lower-variance posterior representation than the ubiquitous noise-scaling of the reparameterization trick, while ensuring higher-quality reconstruction. We further boost the performance by replacing the Kullback-Leibler (KL) divergence with the Wasserstein distribution metric that allows for a sharper posterior. Inspired by the two components, we derive a novel, deterministic-sampling flavor of the VAE, the Unscented Autoencoder (UAE), trained purely with regularization-like terms on the per-sample posterior. We empirically show competitive performance in Fréchet Inception Distance (FID) scores over closely-related models, in addition to a lower training variance than the VAE.
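
The sigma-point idea in the abstract can be illustrated with a minimal sketch. It uses the classic symmetric Unscented Transform set of 2n+1 points, which is an assumption: the abstract does not state which sigma-point scheme or weights the UAE uses. The function names `sigma_points` and `unscented_mean` and the parameter `kappa` are hypothetical and chosen only for this illustration.

```python
import numpy as np


def sigma_points(mu, cov, kappa=1.0):
    """Classic symmetric sigma points for N(mu, cov).

    Assumption: textbook 2n+1 point set with scaling parameter kappa;
    the UAE paper may select points and weights differently.
    """
    mu = np.asarray(mu, dtype=float)
    cov = np.asarray(cov, dtype=float)
    n = mu.shape[0]
    # Columns of L are the scaled square-root directions of the covariance.
    L = np.linalg.cholesky((n + kappa) * cov)
    # Row 0 is the mean; rows 1..n and n+1..2n are mean +/- each column of L.
    points = np.vstack([mu, mu + L.T, mu - L.T])
    weights = np.full(2 * n + 1, 1.0 / (2.0 * (n + kappa)))
    weights[0] = kappa / (n + kappa)
    return points, weights


def unscented_mean(f, mu, cov, kappa=1.0):
    """Approximate E[f(z)] for z ~ N(mu, cov) as a weighted sum over sigma points."""
    points, weights = sigma_points(mu, cov, kappa)
    outputs = np.stack([f(p) for p in points])
    return np.tensordot(weights, outputs, axes=1)


if __name__ == "__main__":
    mu = np.array([0.5, -1.0])
    cov = np.diag([0.04, 0.25])
    pts, w = sigma_points(mu, cov)
    print(pts.shape)                      # one row per sigma point
    print(unscented_mean(np.tanh, mu, cov))
```

Unlike the reparameterization trick, which draws `z = mu + sigma * eps` with random `eps`, this point set is fully determined by the posterior mean and covariance, and its weighted mean and covariance reproduce them exactly.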

