CV | Nov 8, 2024

Improving image synthesis with diffusion-negative sampling

arXiv:2411.05473v1 | 8 citations | h-index: 5 | ECCV
AI Analysis

This work addresses a practical problem for users of diffusion models in image synthesis: it automates negative prompt generation. The contribution is incremental, since it builds on existing prompting techniques.

The paper tackles the difficulty of finding effective negative prompts for diffusion models in image generation. It proposes a diffusion-negative prompting (DNP) strategy, which uses diffusion-negative sampling (DNS) to automatically generate negative prompts that improve prompt adherence and image quality. The approach is validated through experiments and human evaluations.

For image generation with diffusion models (DMs), a negative prompt n can be used to complement the text prompt p, helping define properties not desired in the synthesized image. While this improves prompt adherence and image quality, finding good negative prompts is challenging. We argue that this is due to a semantic gap between humans and DMs, which makes good negative prompts for DMs appear unintuitive to humans. To bridge this gap, we propose a new diffusion-negative prompting (DNP) strategy. DNP is based on a new procedure to sample images that are least compliant with p under the distribution of the DM, denoted as diffusion-negative sampling (DNS). Given p, one such image is sampled, which is then translated into natural language by the user or a captioning model, to produce the negative prompt n*. The pair (p, n*) is finally used to prompt the DM. DNS is straightforward to implement and requires no training. Experiments and human evaluations show that DNP performs well both quantitatively and qualitatively and can be easily combined with several DM variants.
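The three steps described in the abstract (sample a least-compliant image, caption it, prompt with the pair) can be sketched as a small pipeline. This is a minimal illustration under stated assumptions, not the authors' implementation. The names `guided_noise` and `diffusion_negative_prompting` and the three callables are hypothetical. The guidance formula is the standard way a negative prompt replaces the unconditional branch in classifier-free guidance. Using a negative scale to steer away from p is only one plausible stand-in for DNS; the abstract does not specify the actual sampling rule.

```python
from typing import Callable, List, Tuple, TypeVar

Image = TypeVar("Image")


def guided_noise(eps_base: List[float], eps_target: List[float], scale: float) -> List[float]:
    """Classifier-free-guidance style combination of two noise predictions.

    With a negative prompt n, eps_base is the prediction conditioned on n and
    eps_target the prediction conditioned on p, so a positive scale moves the
    sample toward p and away from n.
    ASSUMPTION: a negative scale (moving away from p) is used here only as a
    rough stand-in for DNS; the paper's exact rule is not given in the abstract.
    """
    return [b + scale * (t - b) for b, t in zip(eps_base, eps_target)]


def diffusion_negative_prompting(
    prompt: str,
    sample_negative_image: Callable[[str], Image],  # hypothetical DNS sampler
    caption_image: Callable[[Image], str],          # user or captioning model
    generate: Callable[[str, str], Image],          # DM prompted with (p, n*)
) -> Tuple[str, Image]:
    """Run the DNP steps: DNS sample, caption it to get n*, generate with (p, n*)."""
    negative_image = sample_negative_image(prompt)
    negative_prompt = caption_image(negative_image)
    return negative_prompt, generate(prompt, negative_prompt)
```

In practice the three callables would wrap a real diffusion model and a captioning model. Keeping them as parameters reflects the abstract's point that the captioning step can be done by either the user or a model, and that DNP can be combined with several DM variants.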

