ML · LG · May 21, 2018

Self-Attention Generative Adversarial Networks

arXiv:1805.08318v2 · 4116 citations
Originality: Highly original
AI Analysis

This paper targets a known weakness of convolutional GANs in image generation: keeping distant parts of an image consistent with each other. It gives AI researchers more globally coherent and detailed outputs, though it builds incrementally on existing GAN methods.

The paper tackles the problem of generating high-resolution images with long-range dependencies by proposing SAGAN, which uses self-attention to model global features. It achieves state-of-the-art results on ImageNet, raising the best published Inception score from 36.8 to 52.52 and lowering the Frechet Inception distance from 27.62 to 18.65.
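To make "self-attention to model global features" concrete, here is a minimal NumPy sketch of dot-product self-attention over a flattened feature map. It is an illustration under stated assumptions, not the authors' implementation: the function name `self_attention_2d`, the projection matrices `w_q`, `w_k`, `w_v`, and the residual scale `gamma` are hypothetical names chosen for this example, and the projections stand in for learned 1x1 convolutions.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention_2d(x, w_q, w_k, w_v, gamma=0.0):
    """Illustrative self-attention over a (C, H, W) feature map.

    w_q, w_k: (C_k, C) query/key projections (assumed stand-ins for 1x1 convs).
    w_v:      (C, C) value projection.
    gamma:    scale on the attention branch (assumption: a learnable scalar).
    """
    c, h, w = x.shape
    flat = x.reshape(c, h * w)        # (C, N), one column per spatial location
    q = w_q @ flat                    # (C_k, N)
    k = w_k @ flat                    # (C_k, N)
    v = w_v @ flat                    # (C, N)
    scores = q.T @ k                  # (N, N): scores[i, j] = q_i . k_j
    attn = softmax(scores, axis=-1)   # row i: how much location i attends to each j
    out = v @ attn.T                  # (C, N): out[:, i] = sum_j attn[i, j] * v[:, j]
    # Residual connection: the attention branch is added to the local features.
    return gamma * out.reshape(c, h, w) + x, attn

rng = np.random.default_rng(0)
x = rng.normal(size=(8, 4, 4))
w_q = rng.normal(size=(2, 8))
w_k = rng.normal(size=(2, 8))
w_v = rng.normal(size=(8, 8))
y, attn = self_attention_2d(x, w_q, w_k, w_v, gamma=1.0)
print(y.shape, attn.shape)
```

Every output location is a weighted sum over all N = H x W input locations, which is what lets distant regions influence each other in a single layer. With `gamma=0.0` the layer reduces to the identity, so in this sketch the attention branch can be blended in gradually.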

In this paper, we propose the Self-Attention Generative Adversarial Network (SAGAN) which allows attention-driven, long-range dependency modeling for image generation tasks. Traditional convolutional GANs generate high-resolution details as a function of only spatially local points in lower-resolution feature maps. In SAGAN, details can be generated using cues from all feature locations. Moreover, the discriminator can check that highly detailed features in distant portions of the image are consistent with each other. Furthermore, recent work has shown that generator conditioning affects GAN performance. Leveraging this insight, we apply spectral normalization to the GAN generator and find that this improves training dynamics. The proposed SAGAN achieves the state-of-the-art results, boosting the best published Inception score from 36.8 to 52.52 and reducing Frechet Inception distance from 27.62 to 18.65 on the challenging ImageNet dataset. Visualization of the attention layers shows that the generator leverages neighborhoods that correspond to object shapes rather than local regions of fixed shape.
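The abstract mentions applying spectral normalization to the generator. As a rough sketch of what that operation does to a single weight matrix, the snippet below estimates the largest singular value by power iteration and divides the weights by it. The function name `spectral_normalize` and its parameters are hypothetical, and the iteration count is far higher than a training loop would typically use per step; this is an assumption-laden illustration, not the paper's training code.

```python
import numpy as np

def spectral_normalize(w, n_iters=50, eps=1e-12, seed=0):
    """Return (w / sigma, sigma), where sigma approximates the top singular value.

    w is a 2-D weight matrix; a convolution kernel would first be reshaped
    to (out_channels, -1). All names here are illustrative.
    """
    rng = np.random.default_rng(seed)
    u = rng.normal(size=w.shape[0])
    u /= np.linalg.norm(u) + eps
    for _ in range(n_iters):
        # Alternate between right and left singular-vector estimates.
        v = w.T @ u
        v /= np.linalg.norm(v) + eps
        u = w @ v
        u /= np.linalg.norm(u) + eps
    sigma = u @ w @ v                 # Rayleigh-quotient estimate of the top singular value
    return w / sigma, sigma

w = np.diag([3.0, 1.0, 0.5])
w_sn, sigma = spectral_normalize(w)
print(sigma, np.linalg.norm(w_sn, 2))
```

After normalization the matrix has spectral norm close to 1, which bounds how much the layer can stretch its input.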

Code Implementations: 48 repos

Data from Papers with Code (CC-BY-SA-4.0)

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
