CL, AI, HC, LG · May 4, 2023

Diffusion Explainer: Visual Explanation for Text-to-image Stable Diffusion

arXiv:2305.03509v3 · 33 citations
Originality: Synthesis-oriented
AI Analysis

This tool addresses the problem of making advanced AI models accessible to non-experts. The contribution is incremental: it builds on existing visualization and explanation methods for generative models.

The authors address the difficulty of understanding Stable Diffusion's complex text-to-image generation process with Diffusion Explainer, an interactive visualization tool that explains the model's structure and operations. A 56-participant user study showed substantial learning benefits for non-experts, and the tool has been used by over 10,300 users from 124 countries.

Diffusion-based generative models' impressive ability to create convincing images has garnered global attention. However, their complex structures and operations often pose challenges for non-experts to grasp. We present Diffusion Explainer, the first interactive visualization tool that explains how Stable Diffusion transforms text prompts into images. Diffusion Explainer tightly integrates a visual overview of Stable Diffusion's complex structure with explanations of the underlying operations. By comparing image generation of prompt variants, users can discover the impact of keyword changes on image generation. A 56-participant user study demonstrates that Diffusion Explainer offers substantial learning benefits to non-experts. Our tool has been used by over 10,300 users from 124 countries at https://poloclub.github.io/diffusion-explainer/.

Code Implementations: 1 repo