Generalization Properties of Score-matching Diffusion Models for Intrinsically Low-dimensional Data

Saptarshi Chakraborty, Quentin Berthet, Peter L. Bartlett

arXiv:2603.03700v16.76 citationsh-index: 68

Originality Incremental advance

AI Analysis

This provides theoretical foundations for diffusion models in machine learning, addressing a gap in understanding their convergence rates for real-world data like images, though it is incremental in extending prior analyses.

The paper tackles the problem of deriving statistical guarantees for score-based diffusion models on intrinsically low-dimensional data, showing that the expected Wasserstein-p error scales as O~(n^{-1/d*_{p,q}(μ)}), where d*_{p,q}(μ) is a new dimension measure, thus mitigating the curse of dimensionality.

Despite the remarkable empirical success of score-based diffusion models, their statistical guarantees remain underdeveloped. Existing analyses often provide pessimistic convergence rates that do not reflect the intrinsic low-dimensional structure common in real data, such as that arising in natural images. In this work, we study the statistical convergence of score-based diffusion models for learning an unknown distribution $μ$ from finitely many samples. Under mild regularity conditions on the forward diffusion process and the data distribution, we derive finite-sample error bounds on the learned generative distribution, measured in the Wasserstein-$p$ distance. Unlike prior results, our guarantees hold for all $p \ge 1$ and require only a finite-moment assumption on $μ$, without compact-support, manifold, or smooth-density conditions. Specifically, given $n$ i.i.d.\ samples from $μ$ with finite $q$-th moment and appropriately chosen network architectures, hyperparameters, and discretization schemes, we show that the expected Wasserstein-$p$ error between the learned distribution $\hatμ$ and $μ$ scales as $\mathbb{E}\, \mathbb{W}_p(\hatμ,μ) = \widetilde{O}\!\left(n^{-1 / d^\ast_{p,q}(μ)}\right),$ where $d^\ast_{p,q}(μ)$ is the $(p,q)$-Wasserstein dimension of $μ$. Our results demonstrate that diffusion models naturally adapt to the intrinsic geometry of data and mitigate the curse of dimensionality, since the convergence rate depends on $d^\ast_{p,q}(μ)$ rather than the ambient dimension. Moreover, our theory conceptually bridges the analysis of diffusion models with that of GANs and the sharp minimax rates established in optimal transport. The proposed $(p,q)$-Wasserstein dimension also extends classical Wasserstein dimension notions to distributions with unbounded support, which may be of independent theoretical interest.

View on arXiv PDF

Similar