CV · Dec 21, 2023

DreamDistribution: Learning Prompt Distribution for Diverse In-distribution Generation

Harvard
arXiv:2312.14216v2 · 9 citations · h-index: 34 · ICLR
Originality: Incremental advance
AI Analysis

The paper addresses limited diversity in personalized image generation with T2I models and represents an incremental improvement over existing methods.

The paper tackles the challenge of generating diverse customized images that share reference visual attributes in Text-to-Image diffusion models. It learns a set of soft prompts and samples new prompts from the learned distribution, which enables novel image generation and text-guided editing. The authors evaluate the approach with automatic metrics and human assessment, and show that the learned prompt distribution adapts to other tasks such as text-to-3D.

The popularization of Text-to-Image (T2I) diffusion models enables the generation of high-quality images from text descriptions. However, generating diverse customized images with reference visual attributes remains challenging. This work focuses on personalizing T2I diffusion models at a more abstract concept or category level, adapting commonalities from a set of reference images while creating new instances with sufficient variations. We introduce a solution that allows a pretrained T2I diffusion model to learn a set of soft prompts, enabling the generation of novel images by sampling prompts from the learned distribution. These prompts offer text-guided editing capabilities and additional flexibility in controlling variation and mixing between multiple distributions. We also show the adaptability of the learned prompt distribution to other tasks, such as text-to-3D. Finally, we demonstrate the effectiveness of our approach through quantitative analysis, including automatic evaluation and human assessment. Project website: https://briannlongzhao.github.io/DreamDistribution
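
The idea of sampling prompts from a learned distribution, scaling the variation, and mixing two distributions can be illustrated with a small sketch. This is not the authors' implementation. It assumes, for illustration only, that each soft prompt is a flat embedding vector and that the distribution is a per-dimension Gaussian fitted to the set of prompts. The function names (`fit_prompt_distribution`, `sample_prompts`, `mix_distributions`) and the mixing rule are hypothetical, and the diffusion model that would consume the sampled prompts is left out entirely.

```python
import numpy as np


def fit_prompt_distribution(prompts):
    """Fit a per-dimension Gaussian to K soft prompts of shape (K, D).

    Assumption: a diagonal Gaussian is used here purely for illustration.
    """
    mu = prompts.mean(axis=0)
    sigma = prompts.std(axis=0)
    return mu, sigma


def sample_prompts(mu, sigma, n, scale=1.0, rng=None):
    """Draw n prompts as mu + scale * sigma * eps, with eps ~ N(0, I).

    `scale` is a hypothetical knob for the amount of variation:
    0 collapses every sample to the mean, larger values spread them out.
    """
    rng = np.random.default_rng() if rng is None else rng
    eps = rng.standard_normal((n, mu.shape[0]))
    return mu + scale * sigma * eps


def mix_distributions(params_a, params_b, weight):
    """Blend two Gaussians as the distribution of w*A + (1-w)*B.

    Assumption: A and B are independent, so variances add with squared
    weights. This is one possible mixing rule, chosen for simplicity.
    """
    mu_a, sigma_a = params_a
    mu_b, sigma_b = params_b
    mu = weight * mu_a + (1.0 - weight) * mu_b
    sigma = np.sqrt((weight * sigma_a) ** 2 + ((1.0 - weight) * sigma_b) ** 2)
    return mu, sigma


# Toy usage with random stand-ins for learned prompt embeddings.
rng = np.random.default_rng(0)
cat_prompts = rng.standard_normal((8, 16))        # 8 prompts, 16 dims
dog_prompts = rng.standard_normal((8, 16)) + 2.0  # a second concept

cat_dist = fit_prompt_distribution(cat_prompts)
dog_dist = fit_prompt_distribution(dog_prompts)

narrow = sample_prompts(*cat_dist, n=4, scale=0.5, rng=rng)  # less variation
blended = sample_prompts(*mix_distributions(cat_dist, dog_dist, 0.5), n=4, rng=rng)
```

In a real pipeline, each sampled vector would be passed through the text encoder and used to condition the diffusion model. Here the samples are only arrays, which keeps the sketch runnable without any model weights.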
