Improving Variational Autoencoder with Deep Feature Consistent and Generative Adversarial Training
This work addresses realistic face image generation and facial attribute manipulation, and represents an incremental improvement over existing VAE methods.
The paper improves variational autoencoder (VAE) performance by combining deep feature consistency with generative adversarial training, reporting state-of-the-art face image generation with clearer features such as noses and eyes, along with competitive facial attribute prediction.
We present a new method for improving the performance of variational autoencoders (VAEs). In addition to enforcing the deep feature consistency principle, which ensures that the VAE output and its corresponding input image have similar deep features, we also implement a generative adversarial training mechanism that forces the VAE to output realistic and natural images. We present experimental results showing that a VAE trained with our new method outperforms the state of the art in generating face images, with much clearer and more natural noses, eyes, teeth, and hair textures, as well as reasonable backgrounds. We also show that our method can learn powerful embeddings of input face images, which can be used to achieve facial attribute manipulation. Moreover, we propose a multi-view feature extraction strategy to extract effective image representations, which can be used to achieve state-of-the-art performance in facial attribute prediction.
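To make the combination of objectives described above concrete, the following is a minimal sketch of how a VAE training loss might combine a KL term, a deep feature consistency term, and an adversarial term. It is an illustration under stated assumptions, not the formulation used in this work: the function names, the mean-squared form of the feature loss, the non-saturating adversarial loss, and the weights `alpha`, `beta`, and `gamma` are all hypothetical. Features are passed in as plain lists of floats, standing in for activations that a pretrained network would produce for the input image and for the VAE output.

```python
import math


def kl_divergence(mu, logvar):
    """KL divergence between a diagonal Gaussian N(mu, exp(logvar)) and N(0, I)."""
    return -0.5 * sum(
        1.0 + lv - m * m - math.exp(lv) for m, lv in zip(mu, logvar)
    )


def feature_consistency_loss(feats_in, feats_out):
    """Sum over layers of the mean squared difference between deep features.

    feats_in / feats_out: one flat list of floats per layer (assumed layout),
    computed from the input image and from the VAE reconstruction.
    """
    total = 0.0
    for f_in, f_out in zip(feats_in, feats_out):
        total += sum((a - b) ** 2 for a, b in zip(f_in, f_out)) / len(f_in)
    return total


def generator_adversarial_loss(d_prob_fake, eps=1e-12):
    """Non-saturating generator loss: -log D(x_hat).

    d_prob_fake is the discriminator's probability that the VAE output is real.
    This specific loss form is an assumption made for illustration.
    """
    return -math.log(max(d_prob_fake, eps))


def vae_total_loss(mu, logvar, feats_in, feats_out, d_prob_fake,
                   alpha=1.0, beta=0.5, gamma=0.01):
    """Weighted sum of the three terms; the weights are hypothetical values."""
    return (
        alpha * kl_divergence(mu, logvar)
        + beta * feature_consistency_loss(feats_in, feats_out)
        + gamma * generator_adversarial_loss(d_prob_fake)
    )


if __name__ == "__main__":
    # Toy values only; real features would come from a pretrained network.
    loss = vae_total_loss(
        mu=[0.1, -0.2],
        logvar=[0.0, 0.1],
        feats_in=[[1.0, 2.0], [0.5]],
        feats_out=[[1.0, 4.0], [0.5]],
        d_prob_fake=0.5,
    )
    print(loss)
```

In this sketch the feature term is what pulls the reconstruction toward the input in feature space, while the adversarial term rewards outputs that the discriminator judges to be real; how the two are balanced in practice depends on weights that are not specified here.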