LG DS OC PR ML | Jun 18, 2024

Evaluating the design space of diffusion-based generative models

arXiv:2406.12839v4 22.027 citations

Originality: Incremental advance

AI Analysis

It offers a theoretical foundation for optimizing diffusion model design and addresses a gap in prior work, which assumed the score function had already been learned accurately. The advance is incremental but important for researchers and practitioners in generative AI.

This paper provides a non-asymptotic convergence analysis of denoising score matching under gradient descent and a refined sampling error analysis for variance exploding models, yielding a full error analysis that guides the design of training and sampling processes for diffusion-based generative models.
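To make "denoising score matching under gradient descent" concrete, here is a minimal sketch on a toy problem. It is not the paper's setting: the one-dimensional Gaussian data, the single fixed noise level `sigma`, the linear score model `s(x) = a * x`, and the step size are all assumptions chosen so the minimizer is known in closed form.

```python
import numpy as np

rng = np.random.default_rng(0)
sigma = 1.0                      # single fixed noise level (assumption)
n = 20_000
x0 = rng.standard_normal(n)      # toy data: x0 ~ N(0, 1) (assumption)
eps = rng.standard_normal(n)     # Gaussian noise
x = x0 + sigma * eps             # variance-exploding style perturbation

def dsm_loss(a):
    """Denoising score matching loss for the linear score model s(x) = a * x.

    The regression target for the score at the noisy point is -eps / sigma.
    """
    return np.mean((a * x + eps / sigma) ** 2)

# Plain gradient descent on the scalar parameter a.
a, lr = 0.0, 0.1
for _ in range(200):
    grad = 2.0 * np.mean((a * x + eps / sigma) * x)
    a -= lr * grad

# The noisy marginal is N(0, 1 + sigma^2), whose score is -x / (1 + sigma^2).
a_exact = -1.0 / (1.0 + sigma ** 2)
```

In this toy case the trained slope `a` should land close to `a_exact`, up to sampling error. The paper's analysis concerns the much harder question of how fast this kind of training converges for an actual score model.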

Most existing theoretical investigations of the accuracy of diffusion models, albeit significant, assume the score function has been approximated to a certain accuracy, and then use this a priori bound to control the error of generation. This article instead provides a first quantitative understanding of the whole generation process, i.e., both training and sampling. More precisely, it conducts a non-asymptotic convergence analysis of denoising score matching under gradient descent. In addition, a refined sampling error analysis for variance exploding models is provided. The combination of these two results yields a full error analysis, which elucidates (again, but this time theoretically) how to design the training and sampling processes for effective generation. For instance, our theory implies a preference toward a noise distribution and loss weighting in training that qualitatively agree with the ones used in [Karras et al., 2022]. It also provides perspectives on the choices of time and variance schedules in sampling: when the score is well trained, the design in [Song et al., 2021] is preferable, but when it is less trained, the design in [Karras et al., 2022] becomes preferable.
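For readers unfamiliar with the two sampling designs being compared, the sketch below generates the noise levels each one visits. The formulas are the commonly cited ones from the two referenced papers (geometric spacing for the variance exploding design of Song et al., polynomial spacing with exponent `rho` for Karras et al.); the function names and the values `sigma_min = 0.002`, `sigma_max = 80`, `n = 18`, and `rho = 7` are illustrative assumptions, not taken from this abstract.

```python
import numpy as np

def geometric_sigmas(sigma_min, sigma_max, n):
    """Noise levels spaced geometrically from sigma_max down to sigma_min
    (the variance exploding schedule associated with Song et al., 2021)."""
    ramp = np.arange(n) / (n - 1)
    return sigma_max * (sigma_min / sigma_max) ** ramp

def polynomial_sigmas(sigma_min, sigma_max, n, rho=7.0):
    """Noise levels spaced polynomially from sigma_max down to sigma_min
    (the schedule associated with Karras et al., 2022)."""
    ramp = np.arange(n) / (n - 1)
    inv = 1.0 / rho
    return (sigma_max ** inv + ramp * (sigma_min ** inv - sigma_max ** inv)) ** rho

# Illustrative values only (assumptions, not from the abstract).
geo = geometric_sigmas(0.002, 80.0, 18)
poly = polynomial_sigmas(0.002, 80.0, 18)
```

Both schedules share the same endpoints and differ only in how the intermediate noise levels are spaced. That spacing is the design choice whose preferred form, according to the abstract, depends on how well the score has been trained.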
