CV · Jul 21, 2017

Semantic Image Synthesis via Adversarial Learning

arXiv:1707.06873v1 · 277 citations
Originality: Incremental advance
AI Analysis

This work addresses the challenge of intelligent image manipulation for applications requiring semantic control, though it appears incremental as it builds on existing adversarial methods for image synthesis.

The paper tackles the problem of editing a source image according to a natural language description: the output should be realistic and match the text while preserving the source-image features that the text does not mention. It does this with an adversarial learning model, evaluated on bird and flower datasets.

In this paper, we propose a way of synthesizing realistic images directly from natural language descriptions, which has many useful applications, e.g. intelligent image manipulation. We attempt to accomplish such synthesis: given a source image and a target text description, our model synthesizes images that meet two requirements: 1) being realistic while matching the target text description; 2) maintaining other image features that are irrelevant to the text description. The model should be able to disentangle the semantic information from the two modalities (image and text), and generate new images from the combined semantics. To achieve this, we propose an end-to-end neural architecture that leverages adversarial learning to automatically learn implicit loss functions, which are optimized to fulfill the aforementioned two requirements. We have evaluated our model by conducting experiments on the Caltech-200 bird dataset and the Oxford-102 flower dataset, and have demonstrated that our model is capable of synthesizing realistic images that match the given descriptions, while still maintaining other features of the original images.
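
The abstract does not spell out the loss, so the following is only a minimal sketch of how an adversarial objective could encode requirement 1. It assumes a matching-aware conditional GAN setup in which the discriminator outputs a probability for an (image, text) pair. The function names, the three pair types, and the 0.5 weighting are illustrative assumptions, not details taken from the paper.

```python
import math


def bce(prob, target):
    """Binary cross-entropy for a single probability and a 0/1 target."""
    eps = 1e-12  # guard against log(0)
    prob = min(max(prob, eps), 1.0 - eps)
    return -(target * math.log(prob) + (1 - target) * math.log(1.0 - prob))


def discriminator_loss(d_real_match, d_real_mismatch, d_fake_target):
    """Assumed three-term objective (hypothetical, not from the abstract).

    d_real_match:    D(real image, its matching text)      -> should be 1
    d_real_mismatch: D(real image, a mismatching text)     -> should be 0
    d_fake_target:   D(synthesized image, the target text) -> should be 0
    """
    return (bce(d_real_match, 1)
            + 0.5 * (bce(d_real_mismatch, 0) + bce(d_fake_target, 0)))


def generator_loss(d_fake_target):
    """The generator wants its output, paired with the target text, scored as real."""
    return bce(d_fake_target, 1)


# Example: a discriminator that separates the pair types well has a lower loss
# than one that outputs 0.5 for everything.
good = discriminator_loss(0.9, 0.1, 0.1)
uninformed = discriminator_loss(0.5, 0.5, 0.5)
```

In this sketch nothing explicitly penalizes changes to text-irrelevant content. Under the stated assumptions, requirement 2 would have to come from conditioning the generator on the source image, which is one reading of the "implicit loss functions" mentioned in the abstract.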

Code Implementations: 2 repos
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
