Auto-Embedding Generative Adversarial Networks for High Resolution Image Synthesis
This paper addresses the challenge of high-resolution image synthesis for computer vision applications, but its contribution is incremental because it builds on existing GAN methods.
The paper tackles the problem of generating high-resolution images with GANs, where direct generation often produces incomplete objects. It develops AEGAN, which encodes global structure while capturing fine-grained details, and produces 512x512 images with better perceptual photo-realism than baselines on datasets including CelebA-HQ and ImageNet.
Generating images via the generative adversarial network (GAN) has attracted much attention recently. However, most of the existing GAN-based methods can only produce low-resolution images of limited quality. Directly generating high-resolution images using GANs is nontrivial, and often produces problematic images with incomplete objects. To address this issue, we develop a novel GAN called Auto-Embedding Generative Adversarial Network (AEGAN), which simultaneously encodes the global structure features and captures the fine-grained details. In our network, we use an autoencoder to learn the intrinsic high-level structure of real images and design a novel denoiser network to provide photo-realistic details for the generated images. In the experiments, we are able to produce 512x512 images of promising quality directly from the input noise. The resultant images exhibit better perceptual photo-realism, i.e., with sharper structure and richer details, than other baselines on several datasets, including Oxford-102 Flowers, Caltech-UCSD Birds (CUB), High-Quality Large-scale CelebFaces Attributes (CelebA-HQ), Large-scale Scene Understanding (LSUN) and ImageNet.
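The data flow described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation. It assumes a pipeline in which a generator maps noise into the autoencoder's embedding space, the autoencoder's decoder turns that embedding into a coarse image, and the denoiser refines the result; the abstract names the autoencoder and the denoiser but does not spell out this wiring. All class names, dimensions, and the random linear maps standing in for trained networks are hypothetical, and images are flattened toy vectors rather than 512x512 arrays.

```python
import random


def linear(in_dim, out_dim, rng):
    """Return a random linear map as a closure (stand-in for a trained network)."""
    weights = [[rng.gauss(0, 0.1) for _ in range(in_dim)] for _ in range(out_dim)]

    def apply(x):
        return [sum(w * v for w, v in zip(row, x)) for row in weights]

    return apply


class AEGANSketch:
    """Toy data-flow sketch; every dimension here is an illustrative assumption."""

    def __init__(self, noise_dim=8, embed_dim=16, image_dim=64, seed=0):
        self.rng = random.Random(seed)
        self.noise_dim = noise_dim
        # Autoencoder: learns a compact embedding of real images (global structure).
        self.encoder = linear(image_dim, embed_dim, self.rng)
        self.decoder = linear(embed_dim, image_dim, self.rng)
        # Assumed generator that maps noise into the embedding space.
        self.embedding_generator = linear(noise_dim, embed_dim, self.rng)
        # Denoiser: refines the decoded image (fine-grained details).
        self.denoiser = linear(image_dim, image_dim, self.rng)

    def reconstruct(self, image):
        """Autoencoder path: real image -> embedding -> reconstructed image."""
        return self.decoder(self.encoder(image))

    def generate(self):
        """Generation path: noise -> embedding -> coarse image -> refined image."""
        z = [self.rng.gauss(0, 1) for _ in range(self.noise_dim)]
        embedding = self.embedding_generator(z)
        coarse = self.decoder(embedding)
        return self.denoiser(coarse)


model = AEGANSketch()
sample = model.generate()
print(len(sample))  # length of the flattened toy image
```

In a real system each closure would be a trained convolutional network and the adversarial losses would drive training; the sketch only shows how the components named in the abstract could be chained.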