LG | ML | Apr 13, 2020

ControlVAE: Controllable Variational Autoencoder

arXiv:2004.05988v5 | 21 citations
AI Analysis

ControlVAE addresses performance issues in VAE applications such as language modeling and image generation. Its control-based approach is novel; the contribution is incremental but impactful for those specific domains.

The authors tackle limitations of Variational Autoencoders (VAEs), such as KL vanishing in language modeling and low reconstruction quality in disentangling. They propose ControlVAE, a framework in which a controller automatically tunes the weight on the KL term of the VAE objective during training. Compared with existing methods, this yields better disentangling, more diverse generated text, and higher image reconstruction quality.

Variational Autoencoders (VAEs) and their variants have been widely used in a variety of applications, such as dialog generation, image generation and disentangled representation learning. However, existing VAE models have limitations in different applications. For example, a VAE easily suffers from KL vanishing in language modeling and low reconstruction quality for disentangling. To address these issues, we propose a novel controllable variational autoencoder framework, ControlVAE, that combines a controller, inspired by automatic control theory, with the basic VAE to improve the performance of the resulting generative models. Specifically, we design a new non-linear PI controller, a variant of proportional-integral-derivative (PID) control, to automatically tune the hyperparameter (weight) added in the VAE objective, using the output KL-divergence as feedback during model training. The framework is evaluated on three applications: language modeling, disentangled representation learning, and image generation. The results show that ControlVAE can achieve better disentangling and reconstruction quality than existing methods. For language modeling, it not only averts KL vanishing, but also improves the diversity of generated text. Finally, we also demonstrate that ControlVAE improves the reconstruction quality of generated images compared to the original VAE.
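
The feedback loop described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the class name `PIController`, the gain values, the clamping bounds, the sigmoid-shaped proportional term, and the simple anti-windup rule are all assumptions chosen for illustration. The only parts taken from the abstract are that a non-linear PI controller adjusts the KL weight and that the observed KL-divergence is the feedback signal.

```python
import math


class PIController:
    """Hypothetical non-linear PI controller for the KL weight (beta).

    All defaults below are illustrative assumptions, not values from the paper.
    """

    def __init__(self, target_kl, kp=0.01, ki=0.0001, beta_min=0.0, beta_max=1.0):
        self.target_kl = target_kl  # desired KL-divergence (set point)
        self.kp = kp                # proportional gain (assumed value)
        self.ki = ki                # integral gain (assumed value)
        self.beta_min = beta_min
        self.beta_max = beta_max
        self.integral = 0.0         # accumulated integral term

    def update(self, kl):
        """Return the next beta given the KL-divergence observed this step."""
        error = self.target_kl - kl
        # Clamp the exponent so math.exp cannot overflow on extreme errors.
        bounded = max(-50.0, min(50.0, error))
        # Non-linear P term: near kp when KL is too high, near 0 when too low.
        p_term = self.kp / (1.0 + math.exp(bounded))
        # I term: grows while KL is above the set point, shrinks while below.
        candidate = self.integral - self.ki * error
        beta = p_term + candidate + self.beta_min
        # Simple anti-windup: only keep the integral while beta is in range.
        if self.beta_min <= beta <= self.beta_max:
            self.integral = candidate
        return max(self.beta_min, min(self.beta_max, beta))


# KL above the set point: beta rises to penalize KL more strongly.
ctrl_high = PIController(target_kl=3.0)
high = [ctrl_high.update(10.0) for _ in range(100)]

# KL below the set point (the KL-vanishing regime): beta falls toward beta_min.
ctrl_low = PIController(target_kl=3.0)
low = [ctrl_low.update(0.0) for _ in range(100)]
```

In a training loop, the returned value would weight the KL term, for example `loss = reconstruction_loss + beta * kl`, with `update` called once per step on the current batch's KL-divergence. Lowering the weight when KL falls below the set point is what would counteract KL vanishing in this sketch.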

