CV, AI, GR | Mar 19, 2021

Paint by Word

arXiv:2103.10951v3 | 136 citations
AI Analysis

This work addresses the need for flexible, open-ended image editing tools for creative professionals and users, representing a novel application rather than an incremental improvement.

The paper tackles zero-shot semantic image painting: users point to a location in a synthesized image and apply an arbitrary text description such as 'rustic' or 'happy dog'. The method combines a generative model with a text-image similarity network and is compared against several baselines in user studies.

We investigate the problem of zero-shot semantic image painting. Instead of painting modifications into an image using only concrete colors or a finite set of semantic concepts, we ask how to create semantic paint based on open full-text descriptions: our goal is to be able to point to a location in a synthesized image and apply an arbitrary new concept such as "rustic" or "opulent" or "happy dog." To do this, our method combines a state-of-the-art generative model of realistic images with a state-of-the-art text-image semantic similarity network. We find that, to make large changes, it is important to use non-gradient methods to explore latent space, and it is important to relax the computations of the GAN to target changes to a specific region. We conduct user studies to compare our methods to several baselines.
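As a rough illustration of what a non-gradient search over latent space with a region-restricted objective can look like, here is a minimal sketch. It is not the paper's implementation: `toy_generator` and `toy_score` are hypothetical stand-ins for the generative model and the text-image similarity network, and the optimizer is a simple (1+λ) evolution strategy chosen only as one example of a gradient-free method. The mask here restricts only the score, which is a simplification of the region targeting the abstract describes.

```python
import numpy as np

def toy_generator(z):
    """Hypothetical stand-in for a GAN: maps a latent vector to an 8x8 'image'."""
    rng = np.random.default_rng(0)           # fixed weights keep the mapping deterministic
    w = rng.standard_normal((64, z.size))
    return np.tanh(w @ z).reshape(8, 8)

def toy_score(image, mask, target):
    """Hypothetical stand-in for text-image similarity, evaluated inside the mask only."""
    return -np.sum(mask * (image - target) ** 2)

def evolve_latent(z0, mask, target, steps=200, pop=16, sigma=0.3, seed=1):
    """(1+lambda) evolution strategy: uses only score values, never gradients."""
    rng = np.random.default_rng(seed)
    best_z, best_s = z0, toy_score(toy_generator(z0), mask, target)
    for _ in range(steps):
        # Sample a population of perturbed latents around the current best.
        candidates = best_z + sigma * rng.standard_normal((pop, z0.size))
        scores = [toy_score(toy_generator(c), mask, target) for c in candidates]
        i = int(np.argmax(scores))
        if scores[i] > best_s:                # keep the parent unless a child improves
            best_z, best_s = candidates[i], scores[i]
    return best_z, best_s

# Usage: "paint" a 3x3 region toward a target value, starting from a zero latent.
z0 = np.zeros(4)
mask = np.zeros((8, 8))
mask[2:5, 2:5] = 1.0                          # the user-selected region
target = np.full((8, 8), 0.5)                 # assumed target appearance
initial = toy_score(toy_generator(z0), mask, target)
z, s = evolve_latent(z0, mask, target)
```

Because the loop keeps the parent unless a candidate scores higher, the final score `s` is never worse than `initial`. In a real system the scorer would compare the masked image region against a text prompt rather than a fixed target array.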

