CV AI LG Mar 14, 2023

Text-to-image Diffusion Models in Generative AI: A Survey

Chenshuang Zhang, Chaoning Zhang, Mengchun Zhang, In So Kweon, Junmo Kim

arXiv:2303.07909v340.2423 citations h-index: 62

Originality Synthesis-oriented

AI Analysis

The paper provides a comprehensive overview for researchers and practitioners in generative AI, but as a survey its contribution is incremental.

This survey reviews the progress of diffusion models for generating images from text, summarizing pioneering methods, improvements, and applications like video generation and image editing.

This survey reviews the progress of diffusion models in generating images from text, i.e., text-to-image diffusion models. As a self-contained work, this survey starts with a brief introduction to how diffusion models work for image synthesis, followed by the background for text-conditioned image synthesis. Based on that, we present an organized review of pioneering methods and their improvements on text-to-image generation. We further summarize applications beyond image generation, such as text-guided generation for other modalities like video, and text-guided image editing. Beyond the progress made so far, we discuss existing challenges and promising future directions.
