LG, CV | May 31, 2021

Consistency Regularization for Variational Auto-Encoders

arXiv:2105.14859v2 | 77 citations
Originality: Incremental advance
AI Analysis

The paper addresses a key limitation of VAEs for unsupervised learning and improves representation quality for downstream tasks, though it is an incremental enhancement of existing VAE methods.

The paper tackles the inconsistency problem in variational auto-encoders (VAEs), where the encoder maps semantically similar inputs to different latent representations. It proposes a consistency regularization method that minimizes the KL divergence between the variational distribution for the original observation and the one for a transformed observation. The reported result is improved representation quality and generalization, with state-of-the-art performance on MNIST and CIFAR-10 when the method is applied to NVAE.

Variational auto-encoders (VAEs) are a powerful approach to unsupervised learning. They enable scalable approximate posterior inference in latent-variable models using variational inference (VI). A VAE posits a variational family parameterized by a deep neural network, called an encoder, that takes data as input. This encoder is shared across all observations, which amortizes the cost of inference. However, the encoder of a VAE has the undesirable property that it maps a given observation and a semantics-preserving transformation of it to different latent representations. This "inconsistency" of the encoder lowers the quality of the learned representations, especially for downstream tasks, and also negatively affects generalization. In this paper, we propose a regularization method to enforce consistency in VAEs. The idea is to minimize the Kullback-Leibler (KL) divergence between the variational distribution when conditioning on the observation and the variational distribution when conditioning on a random semantics-preserving transformation of this observation. This regularization is applicable to any VAE. In our experiments, we applied it to four different VAE variants on several benchmark datasets and found that it always improves the quality of the learned representations and also leads to better generalization. In particular, when applied to the Nouveau Variational Auto-Encoder (NVAE), our regularization method yields state-of-the-art performance on MNIST and CIFAR-10. We also applied our method to 3D data and found that it learns representations of superior quality, as measured by accuracy on a downstream classification task.
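The regularizer described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes diagonal-Gaussian variational distributions (so the KL divergence has a closed form), uses a hypothetical linear `toy_encoder` in place of a deep network, and uses a feature flip as a stand-in for a semantics-preserving transformation. The direction of the KL (transformed input first, original second) is also an assumption, since the abstract does not specify it. All function names here are illustrative.

```python
import numpy as np


def gaussian_kl(mu_p, logvar_p, mu_q, logvar_q):
    """Closed-form KL(p || q) for diagonal Gaussians, summed over latent dims."""
    var_p = np.exp(logvar_p)
    var_q = np.exp(logvar_q)
    kl = 0.5 * (logvar_q - logvar_p + (var_p + (mu_p - mu_q) ** 2) / var_q - 1.0)
    return kl.sum(axis=-1)


def toy_encoder(x, W_mu, W_logvar):
    """Hypothetical linear encoder: returns mean and log-variance of q(z | x)."""
    return x @ W_mu, x @ W_logvar


def consistency_penalty(x, x_t, W_mu, W_logvar):
    """Batch-averaged KL between q(z | t(x)) and q(z | x).

    Assumed direction: KL(q(z | t(x)) || q(z | x)).
    """
    mu, logvar = toy_encoder(x, W_mu, W_logvar)
    mu_t, logvar_t = toy_encoder(x_t, W_mu, W_logvar)
    return gaussian_kl(mu_t, logvar_t, mu, logvar).mean()


rng = np.random.default_rng(0)
x = rng.normal(size=(8, 16))      # batch of 8 observations with 16 features
x_t = x[:, ::-1]                  # stand-in transformation: flip the features
W_mu = 0.1 * rng.normal(size=(16, 4))
W_logvar = 0.1 * rng.normal(size=(16, 4))

penalty = consistency_penalty(x, x_t, W_mu, W_logvar)
print("consistency penalty:", penalty)
```

In training, a term like `penalty` would be weighted by a hyperparameter and added to the usual negative ELBO loss, so that the encoder is pushed to give an observation and its transformed version similar variational distributions. The weighting and the exact combination with the ELBO are left out here because the abstract does not state them.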

Code Implementations: 1 repo
