LG · CV · Jul 16, 2024

Isometric Representation Learning for Disentangled Latent Space of Diffusion Models

arXiv:2407.11451v1 · 21 citations · h-index: 6
Originality: Incremental advance
AI Analysis

This work addresses the issue of distorted latent representations in diffusion models for generative modeling, offering incremental improvements in interpretability and control.

The paper tackles the problem of entangled latent spaces in diffusion models by introducing a geometric regularizer that encourages a more disentangled representation. Its experiments report smoother interpolation, more accurate inversion, and more precise attribute control.

The latent space of diffusion models remains largely unexplored, despite their great success and potential in generative modeling. In fact, the latent space of existing diffusion models is entangled, with a distorted mapping from latent space to image space. To tackle this problem, we present Isometric Diffusion, which equips a diffusion model with a geometric regularizer that guides it to learn a geometrically sound latent space of the training data manifold. This approach allows diffusion models to learn a more disentangled latent space, which enables smoother interpolation, more accurate inversion, and more precise control over attributes directly in the latent space. Our extensive experiments on image interpolation, image inversion, and linear editing show the effectiveness of our method.
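The abstract does not spell out the form of the geometric regularizer. As a rough illustration of what an isometry-style penalty can look like, the sketch below assumes a generic map `f` from latent vectors to output vectors and measures how unevenly `f` stretches random unit directions around a point `z`. A map that preserves distances up to a constant scale stretches every direction equally, so the penalty is near zero; a distorted map scores higher. All function names here are hypothetical, and this is not the paper's actual loss.

```python
import math
import random


def directional_stretch(f, z, u, eps=1e-4):
    """Approximate ||J_f(z) u||^2 with a central finite difference."""
    z_plus = [zi + eps * ui for zi, ui in zip(z, u)]
    z_minus = [zi - eps * ui for zi, ui in zip(z, u)]
    diff = [(a - b) / (2.0 * eps) for a, b in zip(f(z_plus), f(z_minus))]
    return sum(d * d for d in diff)


def random_unit_vector(dim, rng):
    """Draw a direction uniformly from the unit sphere."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def isometry_penalty(f, z, num_probes=64, seed=0):
    """Mean squared relative deviation of the stretch across random directions.

    Zero when f is a scaled isometry at z (hypothetical illustrative penalty).
    """
    rng = random.Random(seed)
    stretches = [
        directional_stretch(f, z, random_unit_vector(len(z), rng))
        for _ in range(num_probes)
    ]
    mean = sum(stretches) / len(stretches)
    return sum((s / mean - 1.0) ** 2 for s in stretches) / len(stretches)


def scaled_rotation(z):
    """Toy map: rotate by 0.7 rad and scale by 2 (a scaled isometry)."""
    c, s = math.cos(0.7), math.sin(0.7)
    return [2.0 * (c * z[0] - s * z[1]), 2.0 * (s * z[0] + c * z[1])]


def anisotropic(z):
    """Toy map: stretch one axis and squeeze the other (a distorted mapping)."""
    return [3.0 * z[0], 0.5 * z[1]]


if __name__ == "__main__":
    z0 = [0.3, -1.2]
    print("scaled rotation:", isometry_penalty(scaled_rotation, z0))
    print("anisotropic map:", isometry_penalty(anisotropic, z0))
```

In a training setting, a term of this kind would typically be added to the usual diffusion objective with a weighting coefficient and computed with automatic differentiation rather than finite differences; both of those choices are assumptions here, not details taken from the abstract.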

Code Implementations: 1 repo