CV · Dec 27, 2023

Prompt Expansion for Adaptive Text-to-Image Generation

arXiv:2312.16720v1 · 38 citations · h-index: 7 · ACL
Originality: Incremental advance
AI Analysis

This paper addresses a usability challenge for users of text-to-image models by reducing prompting effort and improving output quality, though it is an incremental improvement over existing methods.

The paper tackles the problem of repetitive and low-quality images in text-to-image generation by proposing a Prompt Expansion framework that automatically generates optimized prompts, resulting in images rated as more aesthetically pleasing and diverse in human evaluations.

Text-to-image generation models are powerful but difficult to use. Users craft specific prompts to get better images, though the images can be repetitive. This paper proposes a Prompt Expansion framework that helps users generate high-quality, diverse images with less effort. The Prompt Expansion model takes a text query as input and outputs a set of expanded text prompts, optimized so that, when passed to a text-to-image model, they generate a wider variety of appealing images. We conduct a human evaluation study showing that images generated through Prompt Expansion are more aesthetically pleasing and diverse than those generated by baseline methods. Overall, this paper presents a novel and effective approach to improving the text-to-image generation experience.
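
The query-to-prompts-to-images flow described above can be sketched in a few lines of Python. This is only an illustration of the interface, not the paper's method: the paper's Prompt Expansion model is learned, whereas the `expand_prompt` function here is a hypothetical stand-in that appends fixed modifiers. The names `FLAVORS`, `expand_prompt`, `generate_images`, and `fake_text_to_image` are all assumptions made for this example.

```python
from typing import Callable, List

# Hypothetical modifiers for illustration only. The paper's model learns
# its expansions; it does not draw from a fixed list like this one.
FLAVORS = [
    "soft morning light, wide shot",
    "dramatic shadows, close-up",
    "watercolor style, pastel palette",
]


def expand_prompt(query: str, n: int = 3) -> List[str]:
    """Toy stand-in for a Prompt Expansion model: one query in, n prompts out."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # Cycle through the modifiers so each expanded prompt keeps the query
    # but adds different detail.
    return [f"{query}, {FLAVORS[i % len(FLAVORS)]}" for i in range(n)]


def generate_images(
    query: str, text_to_image: Callable[[str], str], n: int = 3
) -> List[str]:
    """Expand the query, then pass each expanded prompt to the image model."""
    return [text_to_image(prompt) for prompt in expand_prompt(query, n)]


def fake_text_to_image(prompt: str) -> str:
    """Placeholder for a real text-to-image model (assumed, not a real API)."""
    return f"<image for: {prompt}>"


if __name__ == "__main__":
    for image in generate_images("a cat on a sofa", fake_text_to_image):
        print(image)
```

The point of the sketch is the shape of the pipeline: the user supplies one short query, and the variety in the output comes from the set of expanded prompts rather than from repeated sampling of a single prompt.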

