Be Tangential to Manifold: Discovering Riemannian Metric for Diffusion Models
This work is relevant to researchers and practitioners in generative AI because it enables manifold-aware analysis and editing in diffusion models. The contribution is incremental, however, since it targets an already recognized bottleneck.
The paper tackles the lack of an explicit low-dimensional latent space in diffusion models, which limits manifold-aware operations such as interpolation. It proposes a Riemannian metric on the noise space that encourages geodesics to align with the data manifold. On image interpolation, the resulting transitions are perceptually more natural and faithful than those of existing methods.
Diffusion models are powerful deep generative models (DGMs) that generate high-fidelity, diverse content. However, unlike classical DGMs, they lack an explicit, tractable low-dimensional latent space that parameterizes the data manifold. This absence limits manifold-aware analysis and operations, such as interpolation and editing. Existing interpolation methods for diffusion models typically follow paths through high-density regions, which are not necessarily aligned with the data manifold and can yield perceptually unnatural transitions. To exploit the data manifold learned by diffusion models, we propose a novel Riemannian metric on the noise space, inspired by recent findings that the Jacobian of the score function captures the tangent spaces to the local data manifold. This metric encourages geodesics in the noise space to stay within or run parallel to the learned data manifold. Experiments on image interpolation show that our metric produces perceptually more natural and faithful transitions than existing density-based and naive baselines.
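To make the geometric intuition concrete, the following is a minimal sketch and not the paper's construction. The abstract does not state the form of the metric, so the code assumes a pullback-style metric G(x) = J(x)^T J(x), where J is the Jacobian of the score function. It also works in a 2D toy data space rather than the noise space of a diffusion model. The toy score, the value of `SIGMA`, and the helper names `jacobian` and `metric_length` are all illustrative assumptions. The toy density is a Gaussian concentrated near the x-axis, which plays the role of the data manifold.

```python
import math

SIGMA = 0.1  # assumed thickness of the toy "manifold" (the x-axis)


def score(x):
    """Score (gradient of log-density) of a zero-mean Gaussian
    with covariance diag(1, SIGMA**2)."""
    return [-x[0], -x[1] / SIGMA**2]


def jacobian(f, x, h=1e-5):
    """Central finite-difference Jacobian, J[i][j] = d f_i / d x_j."""
    n = len(x)
    cols = []
    for j in range(n):
        xp, xm = list(x), list(x)
        xp[j] += h
        xm[j] -= h
        fp, fm = f(xp), f(xm)
        cols.append([(fp[i] - fm[i]) / (2 * h) for i in range(len(fp))])
    # cols[j][i] holds d f_i / d x_j; transpose into row-major J[i][j].
    return [[cols[j][i] for j in range(n)] for i in range(len(cols[0]))]


def metric_length(f, x, v):
    """Length of vector v at point x under the ASSUMED metric G = J^T J,
    which equals the Euclidean norm of J v."""
    J = jacobian(f, x)
    Jv = [sum(J[i][j] * v[j] for j in range(len(v))) for i in range(len(J))]
    return math.sqrt(sum(c * c for c in Jv))


x = [0.3, 0.0]  # a point on the toy manifold
tangent_len = metric_length(score, x, [1.0, 0.0])  # step along the manifold
normal_len = metric_length(score, x, [0.0, 1.0])   # step off the manifold

print(f"tangent step length: {tangent_len:.3f}")
print(f"normal step length:  {normal_len:.3f}")
```

Under this assumed metric, a unit step off the manifold is far longer than a unit step along it, so a length-minimizing path (a geodesic) would prefer to travel tangentially. This is the behavior the abstract describes, shown here only for a linear toy score where the Jacobian is constant.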