HCAI · Aug 9, 2023

PromptPaint: Steering Text-to-Image Generation Through Paint Medium-like Interactions

arXiv:2308.05184v1 · 113 citations · h-index: 51 · Has Code
Originality: Incremental advance
AI Analysis

The paper addresses the problem of giving users iterative control over generative AI. The advance is incremental: it builds on existing text-to-image models and contributes new interaction methods.

The authors tackle the challenge of guiding text-to-image generation toward concepts that are difficult to describe in language. They introduce PromptPaint, which lets users mix prompts and apply them to different canvas areas and at different times in the generation process. A set of studies characterizes approaches to prompt mixing and the associated design trade-offs, offering insight into steerable generative tools.

While diffusion-based text-to-image (T2I) models provide a simple and powerful way to generate images, guiding this generation remains a challenge. For concepts that are difficult to describe through language, users may struggle to create prompts. Moreover, many of these models are built as end-to-end systems, lacking support for iterative shaping of the image. In response, we introduce PromptPaint, which combines T2I generation with interactions that model how we use colored paints. PromptPaint allows users to go beyond language to mix prompts that express challenging concepts. Just as we iteratively tune colors through layered placements of paint on a physical canvas, PromptPaint similarly allows users to apply different prompts to different canvas areas and times of the generative process. Through a set of studies, we characterize different approaches for mixing prompts, design trade-offs, and socio-technical challenges for generative models. With PromptPaint we provide insight into future steerable generative tools.
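To make the paint analogy concrete, here is a minimal sketch of one plausible way to "mix" prompts and to apply different prompts at different times. It assumes each prompt has already been encoded as a numeric vector and that mixing is a weighted average of those vectors; the abstract does not confirm that this is PromptPaint's mechanism. The function names `mix_prompts` and `prompt_for_step` are hypothetical and do not come from the paper or its repository.

```python
from typing import List, Sequence


def mix_prompts(embeddings: Sequence[Sequence[float]],
                weights: Sequence[float]) -> List[float]:
    """Blend prompt embeddings like paints: a weighted average of vectors.

    Assumption: prompts are represented as equal-length float vectors.
    Weights are normalized so they sum to 1, like proportions of paint.
    """
    if not embeddings or len(embeddings) != len(weights):
        raise ValueError("need one weight per embedding")
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    total = sum(weights)
    if total <= 0:
        raise ValueError("at least one weight must be positive")
    dim = len(embeddings[0])
    if any(len(e) != dim for e in embeddings):
        raise ValueError("embeddings must share one dimension")

    mixed = [0.0] * dim
    for embedding, weight in zip(embeddings, weights):
        share = weight / total
        for i, value in enumerate(embedding):
            mixed[i] += share * value
    return mixed


def prompt_for_step(step: int, switch_step: int,
                    early: Sequence[float],
                    late: Sequence[float]) -> Sequence[float]:
    """Pick the prompt for a generation step (hypothetical time switch).

    Steps before switch_step use the early prompt; later steps use the
    late prompt, mirroring "different prompts at different times".
    """
    return early if step < switch_step else late


# Toy 2-D vectors standing in for real text-encoder outputs.
prompt_a = [1.0, 0.0]
prompt_b = [0.0, 1.0]
blend = mix_prompts([prompt_a, prompt_b], [3, 1])  # three parts A, one part B
```

In a real pipeline the blended vector would be passed to the image model as its text conditioning. Applying prompts to different canvas areas would additionally need a spatial mask, which this sketch omits.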

Code Implementations: 1 repo
