ML, LG · Mar 3, 2023

Diffusion Models are Minimax Optimal Distribution Estimators

Kazusato Oko, Shunta Akiyama, Taiji Suzuki

arXiv:2303.01861v1 · 39.0 · 174 citations · h-index: 40

Originality: Incremental advance

AI Analysis

This work addresses a theoretical gap: diffusion models are highly successful at distribution estimation in practice, yet their statistical guarantees have so far been limited.

The paper provides the first rigorous analysis of the approximation and generalization abilities of diffusion models for well-known function spaces. It shows that when the true density lies in a Besov space and the empirical score matching loss is properly minimized, the generated distribution achieves nearly minimax optimal estimation rates in the total variation distance and in the Wasserstein distance of order one.

While efficient distribution learning is no doubt behind the groundbreaking success of diffusion modeling, its theoretical guarantees are quite limited. In this paper, we provide the first rigorous analysis of the approximation and generalization abilities of diffusion modeling for well-known function spaces. The highlight of this paper is that when the true density function belongs to the Besov space and the empirical score matching loss is properly minimized, the generated data distribution achieves nearly minimax optimal estimation rates in the total variation distance and in the Wasserstein distance of order one. Furthermore, we extend our theory to demonstrate how diffusion models adapt to low-dimensional data distributions. We expect these results to advance the theoretical understanding of diffusion modeling and its ability to generate verisimilar outputs.
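
To make "empirical score matching loss" concrete, here is a minimal sketch of one common form, the denoising score matching loss, evaluated at a single noise level. It is an illustration only, not the paper's estimator: the 1-D Gaussian toy data, the Ornstein-Uhlenbeck forward process with unit rate, the fixed time t, and the names `ou_forward`, `empirical_dsm_loss`, `true_score`, and `zero_score` are all assumptions made for this example. The abstract does not specify the objective's time weighting or the network class.

```python
import math
import random


def ou_forward(x0, t, rng):
    """Sample X_t given X_0 = x0 under dX = -X dt + sqrt(2) dB (assumed forward process)."""
    std = math.sqrt(1.0 - math.exp(-2.0 * t))
    z = rng.gauss(0.0, 1.0)
    return math.exp(-t) * x0 + std * z, z, std


def empirical_dsm_loss(score_fn, data, t, seed=0):
    """Average squared error between score_fn and the conditional score of X_t given X_0."""
    rng = random.Random(seed)  # fixed seed: every score_fn sees the same noise draws
    total = 0.0
    for x0 in data:
        xt, z, std = ou_forward(x0, t, rng)
        target = -z / std  # gradient in x_t of log p(x_t | x_0)
        total += (score_fn(xt, t) - target) ** 2
    return total / len(data)


# Toy data: X_0 ~ N(2, 1), so the noised marginal is N(2 * exp(-t), 1) for every t.
data_rng = random.Random(1)
data = [data_rng.gauss(2.0, 1.0) for _ in range(5000)]


def true_score(x, t):
    # Score of N(2 * exp(-t), 1), the exact marginal of X_t for this toy data.
    return -(x - 2.0 * math.exp(-t))


def zero_score(x, t):
    # A deliberately poor candidate that ignores the data.
    return 0.0


t = 0.5
loss_true = empirical_dsm_loss(true_score, data, t)
loss_zero = empirical_dsm_loss(zero_score, data, t)
print(loss_true, loss_zero)
```

In this toy setting the true marginal score attains a smaller empirical loss than the zero function. The loss does not vanish even at the true score, because the regression target is the conditional score given X_0 rather than the marginal score.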

