LG, CV | Jun 14, 2023

InfoDiffusion: Representation Learning Using Information Maximizing Diffusion Models

Yingheng Wang, Yair Schiff, Aaron Gokaslan, Weishen Pan, Fei Wang, Christopher De Sa, Volodymyr Kuleshov

arXiv:2306.08757v1 | 26.568 citations | h-index: 62

Originality: Incremental advance

AI Analysis

This work addresses the need for interpretable latent spaces in diffusion models and could aid tasks such as generative design. The advance is incremental, since it builds on existing diffusion frameworks.

The paper tackles the lack of semantically meaningful latent variables in diffusion models, which limits their use for representation learning. It proposes InfoDiffusion, which augments diffusion models with low-dimensional latents that capture high-level variations in the data. The resulting representations are disentangled, human-interpretable, and competitive with state-of-the-art methods, and sample quality remains high.

While diffusion models excel at generating high-quality samples, their latent variables typically lack semantic meaning and are not suitable for representation learning. Here, we propose InfoDiffusion, an algorithm that augments diffusion models with low-dimensional latent variables that capture high-level factors of variation in the data. InfoDiffusion relies on a learning objective regularized with the mutual information between observed and hidden variables, which improves latent space quality and prevents the latents from being ignored by expressive diffusion-based decoders. Empirically, we find that InfoDiffusion learns disentangled and human-interpretable latent representations that are competitive with state-of-the-art generative and contrastive methods, while retaining the high sample quality of diffusion models. Our method enables manipulating the attributes of generated images and has the potential to assist tasks that require exploring a learned latent space to generate quality samples, e.g., generative design.
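To make concrete why a mutual-information term discourages a decoder from ignoring its latents, the following toy sketch computes I(X; Z) for two small discrete joint distributions. It is only an illustration of the quantity named in the abstract, not the paper's estimator or training objective: the function name `mutual_information` and both probability tables are hypothetical, and the actual method works with continuous latents and image data.

```python
import math


def mutual_information(joint):
    """Return I(X; Z) in nats for a discrete joint table joint[x][z].

    Hypothetical helper for illustration only; not taken from the paper.
    """
    # Marginals: sum over rows for p(x), over columns for p(z).
    px = [sum(row) for row in joint]
    pz = [sum(col) for col in zip(*joint)]
    mi = 0.0
    for i, row in enumerate(joint):
        for j, p in enumerate(row):
            if p > 0.0:  # zero-probability cells contribute nothing
                mi += p * math.log(p / (px[i] * pz[j]))
    return mi


# Assumed toy case 1: the latent is ignored, so X and Z are independent.
ignored = [[0.25, 0.25],
           [0.25, 0.25]]

# Assumed toy case 2: the latent fully determines the observation.
informative = [[0.5, 0.0],
               [0.0, 0.5]]

mi_ignored = mutual_information(ignored)
mi_informative = mutual_information(informative)
print(mi_ignored, mi_informative)
```

In the independent case the mutual information is zero, while in the deterministic case it reaches log 2 nats, so an objective that rewards mutual information between observed and hidden variables penalizes solutions in which the latent carries no information about the data.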
