CV · Dec 7, 2023

Smooth Diffusion: Crafting Smooth Latent Spaces in Diffusion Models

arXiv:2312.04410v1 · 62 citations · h-index: 21 · Has Code · CVPR
Originality: Incremental advance
AI Analysis

This work addresses a specific issue for researchers and practitioners: the lack of smoothness in the latent spaces of diffusion models. It offers a plug-and-play solution that improves latent space smoothness for tasks such as image interpolation and editing, though the advance is incremental because it builds on existing diffusion frameworks.

The paper tackles the problem of non-smooth latent spaces in diffusion models, where minor latent variations cause noticeable visual fluctuations in the output image. It proposes Smooth Diffusion, which uses Step-wise Variation Regularization to achieve both high performance and smoothness, and demonstrates its effectiveness in text-to-image generation and downstream tasks.

Recently, diffusion models have made remarkable progress in text-to-image (T2I) generation, synthesizing images with high fidelity and diverse contents. Despite this advancement, latent space smoothness within diffusion models remains largely unexplored. Smooth latent spaces ensure that a perturbation on an input latent corresponds to a steady change in the output image. This property proves beneficial in downstream tasks, including image interpolation, inversion, and editing. In this work, we expose the non-smoothness of diffusion latent spaces by observing noticeable visual fluctuations resulting from minor latent variations. To tackle this issue, we propose Smooth Diffusion, a new category of diffusion models that can be simultaneously high-performing and smooth. Specifically, we introduce Step-wise Variation Regularization to enforce that the ratio between the variation of an arbitrary input latent and that of the output image is constant at any diffusion training step. In addition, we devise an interpolation standard deviation (ISTD) metric to effectively assess the latent space smoothness of a diffusion model. Extensive quantitative and qualitative experiments demonstrate that Smooth Diffusion stands out as a more desirable solution not only in T2I generation but also across various downstream tasks. Smooth Diffusion is implemented as a plug-and-play Smooth-LoRA to work with various community models. Code is available at https://github.com/SHI-Labs/Smooth-Diffusion.
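
The two ideas in the abstract, a constant ratio between input and output variation and a standard deviation measured along an interpolation path, can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names (`slerp`, `interpolation_std`, `variation_ratio_penalty`), the use of spherical interpolation, the L2 distance between adjacent outputs, the finite-difference form of the penalty, and the `generate` callable standing in for a diffusion model are all assumptions made for illustration. The abstract does not give the exact formulas.

```python
import numpy as np

def slerp(z0, z1, t):
    """Spherical interpolation between two latents (an assumed choice of path)."""
    a, b = z0.ravel(), z1.ravel()
    cos = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    omega = np.arccos(np.clip(cos, -1.0, 1.0))
    if np.isclose(np.sin(omega), 0.0):
        # Parallel or anti-parallel latents: fall back to linear interpolation.
        return (1.0 - t) * z0 + t * z1
    return (np.sin((1.0 - t) * omega) * z0 + np.sin(t * omega) * z1) / np.sin(omega)

def interpolation_std(generate, z0, z1, steps=9):
    """ISTD-style score (hypothetical form): standard deviation of the L2
    distances between outputs of adjacent points on the interpolation path.
    A lower value means the output changes at a steadier rate."""
    ts = np.linspace(0.0, 1.0, steps)
    images = [generate(slerp(z0, z1, t)) for t in ts]
    dists = [np.linalg.norm(b - a) for a, b in zip(images[:-1], images[1:])]
    return float(np.std(dists))

def variation_ratio_penalty(generate, z, delta, target_ratio):
    """Finite-difference stand-in for a variation regularizer: penalize the
    squared gap between (output change / input change) and a target constant."""
    out_change = np.linalg.norm(generate(z + delta) - generate(z))
    in_change = np.linalg.norm(delta)
    return float((out_change / in_change - target_ratio) ** 2)

if __name__ == "__main__":
    z0, z1 = np.array([1.0, 0.0]), np.array([0.0, 1.0])
    smooth = lambda z: 2.0 * z                     # toy "model" with steady output change
    jumpy = lambda z: np.where(z > 0.5, 10.0, 0.0) # toy "model" with abrupt jumps
    print("smooth ISTD-style score:", interpolation_std(smooth, z0, z1))
    print("jumpy ISTD-style score:", interpolation_std(jumpy, z0, z1))
```

In this toy setting the linear map yields a near-zero score because every interpolation step changes the output by the same amount, while the thresholded map yields a larger score because its output jumps at some steps and stays fixed at others. A real evaluation would replace `generate` with a full sampling pass of the diffusion model.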

Code Implementations: 1 repo