CV · Oct 20, 2022

Diffusion Models already have a Semantic Latent Space

arXiv:2210.10960v2 · 376 citations · h-index: 15
Originality: Incremental advance
AI Analysis

This work addresses controllable image generation with diffusion models. It is rated incremental because it builds on existing pretrained models rather than introducing a new one.

The authors tackle the lack of a semantic latent space in diffusion models, which is essential for controlling generation. They propose an asymmetric reverse process (Asyrp) that discovers such a space in frozen pretrained models; the resulting space shows properties such as homogeneity and linearity that make it suitable for image manipulation.

Diffusion models achieve outstanding generative performance in various domains. Despite their great success, they lack semantic latent space which is essential for controlling the generative process. To address the problem, we propose asymmetric reverse process (Asyrp) which discovers the semantic latent space in frozen pretrained diffusion models. Our semantic latent space, named h-space, has nice properties for accommodating semantic image manipulation: homogeneity, linearity, robustness, and consistency across timesteps. In addition, we introduce a principled design of the generative process for versatile editing and quality boosting by quantifiable measures: editing strength of an interval and quality deficiency at a timestep. Our method is applicable to various architectures (DDPM++, iDDPM, and ADM) and datasets (CelebA-HQ, AFHQ-dog, LSUN-church, LSUN-bedroom, and METFACES). Project page: https://kwonminki.github.io/Asyrp/
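
The abstract does not spell out the update rule, so the following is only a minimal sketch under stated assumptions. It assumes the standard deterministic DDIM step, and it assumes (our reading of "asymmetric", not something this summary states) that the edited noise prediction enters only the predicted-x0 term while the original prediction is kept in the direction term. The function names `ddim_step` and `asymmetric_step` and all parameter names are hypothetical; in practice `eps_edited` would come from the frozen network after shifting its internal h-space feature, which is not modelled here.

```python
import math


def ddim_step(x_t, eps, abar_t, abar_prev):
    """Standard deterministic DDIM update (eta = 0).

    abar_t and abar_prev are the cumulative alpha products at the
    current and the previous (less noisy) timestep.
    """
    x0_pred = (x_t - math.sqrt(1.0 - abar_t) * eps) / math.sqrt(abar_t)
    direction = math.sqrt(1.0 - abar_prev) * eps
    return math.sqrt(abar_prev) * x0_pred + direction


def asymmetric_step(x_t, eps_orig, eps_edited, abar_t, abar_prev):
    """Assumed asymmetric variant of the step above.

    eps_edited (hypothetical: the noise prediction after an h-space
    shift) is used only for the predicted x0; eps_orig (the unmodified
    prediction) is kept for the direction term.
    """
    x0_pred = (x_t - math.sqrt(1.0 - abar_t) * eps_edited) / math.sqrt(abar_t)
    direction = math.sqrt(1.0 - abar_prev) * eps_orig
    return math.sqrt(abar_prev) * x0_pred + direction


if __name__ == "__main__":
    # Scalar toy values; because the schedule terms are plain floats,
    # the same functions also accept array-like inputs such as NumPy arrays.
    plain = ddim_step(1.0, 0.0, 0.25, 1.0)
    edited = asymmetric_step(1.0, 0.0, 1.0, 0.25, 1.0)
    print(plain, edited)
```

With identical `eps_orig` and `eps_edited` the sketch reduces to the plain DDIM step, so any change in the output comes solely from the edited prediction in the x0 term.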

Code Implementations: 1 repo
