CV · May 3, 2018

MC-GAN: Multi-conditional Generative Adversarial Network for Image Synthesis

arXiv:1805.01123v5 · 64 citations
Originality: Incremental advance
AI Analysis

The paper addresses a new task in text-to-image generation: preserving the background context of a given base image. The advance is incremental because it builds on existing methods, but it is a first attempt at this specific challenge.

The paper tackles the problem of generating object images from text attributes while preserving a given background. It introduces MC-GAN, which controls object and background information jointly, and reports photo-realistic 128 x 128 images on bird and flower datasets.

In this paper, we introduce a new method for generating an object image from text attributes at a desired location when a base image is given. Going one step beyond existing studies on text-to-image generation, which mainly focus on the object's appearance, the proposed method aims to generate an object image while preserving the given background information, which is the first attempt in this field. To tackle the problem, we propose a multi-conditional GAN (MC-GAN) which controls both the object and background information jointly. As a core component of MC-GAN, we propose a synthesis block which disentangles the object and background information in the training stage. This block enables MC-GAN to generate a realistic object image with the desired background by controlling the amount of background information taken from the given base image using the foreground information from the text attributes. In experiments with the Caltech-200 bird and Oxford-102 flower datasets, we show that our model is able to generate photo-realistic images with a resolution of 128 x 128. The source code of MC-GAN is released.
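
The abstract describes the synthesis block only at a high level: foreground information decides how much background information from the base image passes through. The sketch below is one way to picture that idea, not the authors' implementation. It assumes a per-pixel "switch" squashed to [0, 1] by a sigmoid, which scales the background feature before it is added to the foreground feature. The names `gated_blend` and `switch_logits`, the additive form, and the use of plain nested lists in place of feature maps are all assumptions made for illustration.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function, returning a value in [0, 1]."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def gated_blend(foreground, background, switch_logits):
    """Hypothetical gate: out = foreground + sigmoid(switch) * background.

    All three arguments are 2-D lists of floats with the same shape,
    standing in for single-channel feature maps.
    """
    blended = []
    for fg_row, bg_row, sw_row in zip(foreground, background, switch_logits):
        blended.append(
            [fg + sigmoid(s) * bg for fg, bg, s in zip(fg_row, bg_row, sw_row)]
        )
    return blended
```

Under these assumptions, a strongly negative switch value suppresses the background at that pixel, so only the foreground remains, while a strongly positive value lets the background through unchanged. A learned model would produce the switch values from the text-conditioned foreground features rather than take them as input.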

Code Implementations: 2 repos
