CV · Aug 30, 2017

Adversarial nets with perceptual losses for text-to-image synthesis

arXiv:1708.09321v1 · 36 citations
Originality: Incremental advance
AI Analysis

This work addresses image quality issues in text-to-image synthesis for applications like content creation, but it is incremental as it builds on existing GAN methods.

The paper tackles the visible flaws and weak structural definition in images that GANs generate from text. It trains the generator with perceptual loss functions to improve perceptual quality, and it presents synthetic images of birds and flowers that are visually more compelling than those of existing work.

Recent approaches in generative adversarial networks (GANs) can automatically synthesize realistic images from descriptive text. Despite the overall fair quality, the generated images often expose visible flaws that lack structural definition for an object of interest. In this paper, we aim to extend the state of the art for GAN-based text-to-image synthesis by improving the perceptual quality of generated images. Differentiated from previous work, our synthetic image generator optimizes on perceptual loss functions that measure pixel, feature activation, and texture differences against a natural image. We present visually more compelling synthetic images of birds and flowers generated from text descriptions in comparison to some of the most prominent existing work.
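
The three loss terms named in the abstract (pixel, feature activation, and texture differences) can be illustrated with a minimal sketch. The abstract does not give the paper's exact definitions, feature network, or term weights, so everything below is an assumption: the texture term uses the common Gram-matrix formulation, the feature maps are taken as given (in practice they might come from a pretrained network, which is not shown), and the names `perceptual_loss`, `w_pixel`, `w_feature`, and `w_texture` are hypothetical.

```python
def mse(a, b):
    """Mean squared error between two equal-length flat lists of floats."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def flatten(rows):
    """Flatten a list of lists into a single flat list."""
    return [v for row in rows for v in row]


def gram_matrix(features):
    """Gram matrix of a feature map given as C channels of N activations each.

    Entry (i, j) is the dot product of channels i and j, divided by C * N.
    This normalization is an assumption, not taken from the paper.
    """
    c = len(features)
    n = len(features[0])
    return [
        [sum(x * y for x, y in zip(fi, fj)) / (c * n) for fj in features]
        for fi in features
    ]


def perceptual_loss(fake_pixels, real_pixels, fake_feats, real_feats,
                    w_pixel=1.0, w_feature=1.0, w_texture=1.0):
    """Weighted sum of pixel, feature, and texture losses (hypothetical weights)."""
    # Pixel loss: direct difference between generated and natural image pixels.
    pixel = mse(fake_pixels, real_pixels)
    # Feature loss: difference between feature activations of the two images.
    feature = mse(flatten(fake_feats), flatten(real_feats))
    # Texture loss: difference between channel correlations (Gram matrices).
    texture = mse(flatten(gram_matrix(fake_feats)),
                  flatten(gram_matrix(real_feats)))
    return w_pixel * pixel + w_feature * feature + w_texture * texture
```

In a GAN setting, a term like this would typically be added to the generator's adversarial objective, so the generator is rewarded both for fooling the discriminator and for matching a natural image perceptually. How the paper combines the terms is not stated in the abstract.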

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
