CV · AI · LG · Mar 14, 2023

Text-to-image Diffusion Models in Generative AI: A Survey

arXiv:2303.07909v3 · 423 citations · h-index: 33
Originality: Synthesis-oriented
AI Analysis

The survey provides a comprehensive overview for researchers and practitioners in generative AI, but as a survey its contribution is incremental rather than novel.

This survey reviews the progress of diffusion models for generating images from text, summarizing pioneering methods, improvements, and applications like video generation and image editing.

This survey reviews the progress of diffusion models in generating images from text, i.e., text-to-image diffusion models. As a self-contained work, this survey starts with a brief introduction to how diffusion models work for image synthesis, followed by the background for text-conditioned image synthesis. Based on that, we present an organized review of pioneering methods and their improvements on text-to-image generation. We further summarize applications beyond image generation, such as text-guided generation for other modalities like video, and text-guided image editing. Beyond the progress made so far, we discuss existing challenges and promising future directions.

