Convergence of Diffusion Models Under the Manifold Hypothesis in High-Dimensions
This work offers theoretical insight into why diffusion models work well in practice, addressing a foundational gap for machine learning researchers, although the contribution is incremental in that it builds on existing theory.
The authors study the convergence of Denoising Diffusion Probabilistic Models (DDPMs) under the manifold hypothesis. They prove that score learning achieves rates independent of the ambient dimension, and that the sampling complexity is independent of the ambient dimension with respect to the Kullback-Leibler divergence and $O(\sqrt{D})$ with respect to the Wasserstein distance.
Denoising Diffusion Probabilistic Models (DDPMs) are powerful state-of-the-art methods for generating synthetic data from high-dimensional data distributions, and are widely used for image, audio, and video generation as well as many other applications in science and beyond. The \textit{manifold hypothesis} states that high-dimensional data often lie on lower-dimensional manifolds within the ambient space, and is widely believed to hold in the examples above. While recent results have provided invaluable insight into how diffusion models adapt to the manifold hypothesis, they do not capture the great empirical success of these models, making this a very fruitful research direction. In this work, we study DDPMs under the manifold hypothesis and prove that they achieve rates independent of the ambient dimension in terms of score learning. In terms of sampling complexity, we obtain rates independent of the ambient dimension w.r.t. the Kullback-Leibler divergence, and $O(\sqrt{D})$ w.r.t. the Wasserstein distance. We do this by developing a new framework connecting diffusion models to the well-studied theory of extrema of Gaussian Processes.
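To give some intuition for why extrema of Gaussian processes can depend on the intrinsic rather than the ambient dimension, the following toy sketch may help. It is not the construction used in this work. It assumes a hypothetical manifold, the unit sphere $S^{d-1}$ embedded in the first $d$ coordinates of $\mathbb{R}^D$, and takes $D$ to be the ambient dimension (the abstract does not define it explicitly). For $Z \sim N(0, I_D)$, the supremum of the Gaussian process $x \mapsto \langle Z, x \rangle$ over this manifold equals the norm of the first $d$ coordinates of $Z$, which is of order $\sqrt{d}$, while the full norm of $Z$ is of order $\sqrt{D}$. The function name and parameters are illustrative only.

```python
import numpy as np


def sup_gaussian_process_on_sphere(d, D, n_trials=500, seed=0):
    """Monte Carlo estimate of E[sup_{x in M} <Z, x>] and E[||Z||].

    Assumption (toy setting, not from the paper): M is the unit sphere
    S^{d-1} sitting in the first d coordinates of R^D, and Z ~ N(0, I_D).
    """
    rng = np.random.default_rng(seed)
    z = rng.standard_normal((n_trials, D))

    # For unit vectors x supported on the first d coordinates,
    # Cauchy-Schwarz gives sup_x <z, x> = ||z[:d]||.
    sup_on_manifold = np.linalg.norm(z[:, :d], axis=1)

    # Norm of the full noise vector, for comparison with the ambient scale.
    ambient_norm = np.linalg.norm(z, axis=1)

    return sup_on_manifold.mean(), ambient_norm.mean()


if __name__ == "__main__":
    d = 8  # intrinsic dimension (hypothetical choice)
    for D in (16, 256, 1024):
        sup_mean, norm_mean = sup_gaussian_process_on_sphere(d, D)
        print(f"D={D:5d}  E[sup over M] ~ {sup_mean:.2f}  E[||Z||] ~ {norm_mean:.2f}")
```

In this toy setting the first estimate stays near $\sqrt{d}$ as $D$ grows, whereas the second grows like $\sqrt{D}$. This mirrors, only loosely, the contrast between dimension-free rates and the $O(\sqrt{D})$ Wasserstein rate stated above.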