CVOct 2, 2016

Deep Feature Consistent Variational Autoencoder

arXiv:1610.00291v281 citations
Originality Incremental advance
AI Analysis

This work addresses image generation and representation quality for computer vision applications, but it is incremental as it builds on existing VAE and style transfer methods.

The authors tackled the problem of improving Variational Autoencoder (VAE) output quality by replacing pixel-by-pixel loss with deep feature consistency, resulting in more natural visual appearance and better perceptual quality on the CelebA dataset, with state-of-the-art performance in facial attribute prediction.

We present a novel method for constructing Variational Autoencoder (VAE). Instead of using pixel-by-pixel loss, we enforce deep feature consistency between the input and the output of a VAE, which ensures the VAE's output to preserve the spatial correlation characteristics of the input, thus leading the output to have a more natural visual appearance and better perceptual quality. Based on recent deep learning works such as style transfer, we employ a pre-trained deep convolutional neural network (CNN) and use its hidden features to define a feature perceptual loss for VAE training. Evaluated on the CelebA face dataset, we show that our model produces better results than other methods in the literature. We also show that our method can produce latent vectors that can capture the semantic information of face expressions and can be used to achieve state-of-the-art performance in facial attribute prediction.

Code Implementations14 repos

Data from Papers with Code (CC-BY-SA-4.0)

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes