CV, IV
Feb 11, 2021

SWAGAN: A Style-based Wavelet-driven Generative Model

arXiv:2102.06108v1
44 citations
Originality: Incremental advance
AI Analysis

This work addresses a specific bottleneck in GANs for image generation, offering incremental improvements in visual quality and efficiency for applications in computer vision and graphics.

The paper tackles high-frequency content degradation in GANs by introducing SWAGAN, a style-based, wavelet-driven model that enforces frequency-aware representations in both the generator and the discriminator. The reported result is higher-quality images with more realistic high-frequency detail, along with improved computational performance.

In recent years, considerable progress has been made in the visual quality of Generative Adversarial Networks (GANs). Even so, these networks still suffer from degradation in quality for high-frequency content, stemming from a spectrally biased architecture, and similarly unfavorable loss functions. To address this issue, we present a novel general-purpose Style and WAvelet based GAN (SWAGAN) that implements progressive generation in the frequency domain. SWAGAN incorporates wavelets throughout its generator and discriminator architectures, enforcing a frequency-aware latent representation at every step of the way. This approach yields enhancements in the visual quality of the generated images, and considerably increases computational performance. We demonstrate the advantage of our method by integrating it into the StyleGAN2 framework, and verifying that content generation in the wavelet domain leads to higher quality images with more realistic high-frequency content. Furthermore, we verify that our model's latent space retains the qualities that allow StyleGAN to serve as a basis for a multitude of editing tasks, and show that our frequency-aware approach also induces improved downstream visual quality.
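To make "generation in the wavelet domain" concrete, here is a minimal sketch of a single-level 2D wavelet decomposition and its inverse. It assumes an orthonormal Haar wavelet and NumPy arrays; the abstract does not name the wavelet or its normalization, and the function names `haar_dwt2` and `haar_idwt2` are illustrative rather than taken from the authors' code. The point it shows is that an image can be split into one low-frequency band and three high-frequency detail bands at half resolution, and recovered exactly from them, so a network that outputs these bands is still describing a full image.

```python
import numpy as np


def haar_dwt2(x):
    """One level of an orthonormal 2D Haar transform (assumed wavelet).

    x must have even height and width. Returns four half-resolution bands:
    the low-frequency average and three high-frequency detail bands.
    """
    a = x[0::2, 0::2]  # top-left pixel of every 2x2 block
    b = x[0::2, 1::2]  # top-right
    c = x[1::2, 0::2]  # bottom-left
    d = x[1::2, 1::2]  # bottom-right
    low = (a + b + c + d) / 2.0          # local average (low frequencies)
    detail_cols = (a - b + c - d) / 2.0  # change between left and right columns
    detail_rows = (a + b - c - d) / 2.0  # change between top and bottom rows
    detail_diag = (a - b - c + d) / 2.0  # diagonal change
    return low, detail_cols, detail_rows, detail_diag


def haar_idwt2(low, detail_cols, detail_rows, detail_diag):
    """Exact inverse of haar_dwt2: rebuild the full-resolution image."""
    h, w = low.shape
    x = np.empty((2 * h, 2 * w), dtype=float)
    x[0::2, 0::2] = (low + detail_cols + detail_rows + detail_diag) / 2.0
    x[0::2, 1::2] = (low - detail_cols + detail_rows - detail_diag) / 2.0
    x[1::2, 0::2] = (low + detail_cols - detail_rows - detail_diag) / 2.0
    x[1::2, 1::2] = (low - detail_cols - detail_rows + detail_diag) / 2.0
    return x


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = rng.standard_normal((8, 8))
    bands = haar_dwt2(image)
    restored = haar_idwt2(*bands)
    print("band shape:", bands[0].shape)
    print("perfect reconstruction:", np.allclose(restored, image))
```

Because the high-frequency content lives in its own bands here, a loss or architecture that operates on the bands cannot ignore fine detail as easily as one that only sees pixels, which is the intuition behind the frequency-aware design described in the abstract.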

Code Implementations: 2 repos
