CV | Feb 22, 2018

ChatPainter: Improving Text to Image Generation using Dialogue

arXiv:1802.08216v1 | 101 citations
Originality: Incremental advance
AI Analysis

This work addresses the challenge of insufficient information in captions for text-to-image generation, offering a domain-specific improvement for computer vision tasks.

The authors tackled the problem of generating realistic images from text descriptions on the MS COCO dataset by adding a dialogue that further describes the scene to the caption input, resulting in significant improvements in inception score and image quality.

Synthesizing realistic images from text descriptions on a dataset like Microsoft Common Objects in Context (MS COCO), where each image can contain several objects, is a challenging task. Prior work has used text captions to generate images. However, captions might not be informative enough to capture the entire image, and they might be insufficient for the model to understand which objects in the images correspond to which words in the captions. We show that adding a dialogue that further describes the scene leads to significant improvement in the inception score and in the quality of generated images on the MS COCO dataset.
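
For readers unfamiliar with the metric named in the abstract, the inception score is commonly defined as exp(E_x[KL(p(y|x) || p(y))]), where p(y|x) is a classifier's predicted class distribution for a generated image x and p(y) is the average of those distributions over all generated images. The sketch below is a minimal illustration of that formula only, not the authors' evaluation code. It assumes the class probabilities have already been computed (in practice by a pretrained Inception network); the function name `inception_score` and the toy arrays are hypothetical, and the averaging over several sample splits that published evaluations typically report is omitted.

```python
import numpy as np

def inception_score(probs, eps=1e-12):
    """Illustrative inception score from precomputed class probabilities.

    probs: array of shape (N, C); row i is an assumed classifier output
    p(y | x_i) for generated image x_i, with each row summing to 1.
    """
    probs = np.asarray(probs, dtype=np.float64)
    # Marginal class distribution p(y), averaged over all generated images.
    marginal = probs.mean(axis=0, keepdims=True)
    # Per-image KL divergence KL(p(y|x) || p(y)); eps guards against log(0).
    kl = np.sum(probs * (np.log(probs + eps) - np.log(marginal + eps)), axis=1)
    # Exponentiate the mean KL divergence.
    return float(np.exp(kl.mean()))

# Toy inputs (not real model outputs):
# identical uniform predictions carry no information, so the score is about 1;
# confident predictions spread evenly over 4 classes give a score of about 4.
uniform = np.full((8, 4), 0.25)
confident = np.eye(4)[np.arange(8) % 4]
print(inception_score(uniform), inception_score(confident))
```

The score rises when individual predictions are confident (low-entropy p(y|x)) and when the generated set covers many classes (high-entropy p(y)), which is why it is used as a proxy for both image quality and diversity.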
