LG · CV · Aug 15, 2024

What Secrets Do Your Manifolds Hold? Understanding the Local Geometry of Generative Models

arXiv:2408.08307v2 · 7 citations · h-index: 17
Originality: Incremental advance
AI Analysis

This work aims to help researchers and practitioners understand and improve generation quality in generative models, offering incremental insights into the geometry of learned manifolds.

The paper investigates the local geometry of the manifolds learned by generative models and how it relates to generation outcomes. It reports that local geometric descriptors are indicative of aesthetics, diversity, and memorization, and that geometry-based guidance lets Stable Diffusion improve its own generations.

Deep Generative Models are frequently used to learn continuous representations of complex data distributions using a finite number of samples. For any generative model, including pre-trained foundation models with Diffusion or Transformer architectures, generation performance can significantly vary across the learned data manifold. In this paper, we study the local geometry of the learned manifold and its relationship to generation outcomes for a wide range of generative models, including DDPM, Diffusion Transformer (DiT), and Stable Diffusion 1.4. Building on the theory of continuous piecewise-linear (CPWL) generators, we characterize the local geometry in terms of three geometric descriptors: scaling ($ψ$), rank ($ν$), and complexity/un-smoothness ($δ$). We provide quantitative and qualitative evidence showing that for a given latent-image pair, the local descriptors are indicative of generation aesthetics, diversity, and memorization by the generative model. Finally, we demonstrate that by training a reward model on the local scaling for Stable Diffusion, we can self-improve both generation aesthetics and diversity using 'geometry reward' based guidance during denoising.
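To make the three descriptors concrete, here is a minimal sketch on a toy one-hidden-layer ReLU generator, which is CPWL by construction. The abstract does not give formulas, so the definitions below are assumptions for illustration: $ψ$ is taken as the sum of log nonzero singular values of the generator's Jacobian at a latent point, $ν$ as the exponential of the entropy of the normalized singular values, and $δ$ is replaced by a simple proxy that counts distinct ReLU activation patterns among points sampled around the latent. The function names (`local_jacobian`, `local_descriptors`, `local_complexity_proxy`) and the toy network are hypothetical and do not come from the paper.

```python
import numpy as np

def local_jacobian(z, W1, b1, W2):
    """Exact Jacobian of x = W2 @ relu(W1 @ z + b1) + b2 at latent z.

    Inside one linear region the ReLU acts as a fixed 0/1 mask, so the
    Jacobian is W2 @ diag(mask) @ W1 (the output bias b2 drops out).
    """
    mask = (W1 @ z + b1 > 0).astype(W1.dtype)
    return W2 @ (mask[:, None] * W1)

def local_descriptors(J, tol=1e-8):
    """Return (psi, nu) from the singular values of J (assumed definitions)."""
    s = np.linalg.svd(J, compute_uv=False)
    s = s[s > tol]                      # keep only nonzero singular values
    if s.size == 0:                     # fully collapsed region
        return float("-inf"), 0.0
    psi = float(np.sum(np.log(s)))      # log-volume change of the local map
    p = s / s.sum()                     # normalized singular values
    nu = float(np.exp(-np.sum(p * np.log(p))))  # entropy-based soft rank
    return psi, nu

def local_complexity_proxy(z, W1, b1, radius=0.1, n_samples=256, seed=0):
    """Count distinct activation patterns among points sampled near z.

    This is a sampling-based stand-in for local complexity, not the
    paper's estimator.
    """
    rng = np.random.default_rng(seed)
    pts = z + radius * rng.uniform(-1.0, 1.0, size=(n_samples, z.size))
    patterns = pts @ W1.T + b1 > 0      # one boolean row per sampled point
    return len({row.tobytes() for row in patterns})

# Usage on a random toy generator (4-d latent, 16 hidden units, 8-d output).
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(16, 4)), rng.normal(size=16)
W2 = rng.normal(size=(8, 16))
z = rng.normal(size=4)

psi, nu = local_descriptors(local_jacobian(z, W1, b1, W2))
delta = local_complexity_proxy(z, W1, b1)
print(f"psi={psi:.3f}  nu={nu:.3f}  delta_proxy={delta}")
```

For a large model such as Stable Diffusion the Jacobian would not be formed explicitly like this; the toy case is only meant to show what each descriptor measures at a single latent point.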

