Deterministic Dynamics of Sampling Processes in Score-Based Diffusion Models with Multiplicative Noise Conditioning
This work addresses a theoretical gap for researchers in generative modeling and offers insight into model behavior. It is incremental, however: it builds on prior findings and introduces no new methods or broad advancements.
The paper asks why score-based diffusion models with multiplicative noise conditioning perform well in practice even though, in theory, this conditioning prevents them from learning the correct score function. It answers by studying the deterministic dynamics of the associated differential equations.
Score-based diffusion models generate new samples by learning the score function associated with a diffusion process. The effectiveness of these models can be explained theoretically using differential equations related to the sampling process. Song and Ermon (2020) further demonstrated that neural networks using multiplicative noise conditioning can also generate satisfactory samples. In this setup, the model is expressed as the product of two functions: one depending on the spatial variable and the other on the noise magnitude. This structure limits the model's ability to represent a more general relationship between the spatial variable and the noise, so it cannot fully learn the correct score. Despite this limitation, the models perform well in practice. In this work, we provide a theoretical explanation for this phenomenon by studying the deterministic dynamics of the associated differential equations, offering insight into how the model operates.
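The claim that a product of a spatial function and a noise function cannot represent a general score can be checked numerically. The sketch below is an illustration only and is not taken from the paper. It assumes a one-dimensional data distribution that is an equal mixture of two Gaussians, with hypothetical parameters m and s0, and evaluates its exact noise-perturbed score on a grid of positions and noise levels. A function of the form f(x) g(sigma) yields a rank-one grid, so a clearly nonzero second singular value after column normalization indicates that no such product reproduces the score exactly. A single Gaussian (m = 0) is included as a control, since its score is separable.

```python
import numpy as np

def mixture_score(x, sigma, m=2.0, s0=0.5):
    """Exact score of 0.5*N(-m, s0^2) + 0.5*N(+m, s0^2) after adding N(0, sigma^2) noise.

    m and s0 are illustrative choices, not values from the paper.
    """
    v = s0**2 + sigma**2  # variance of each perturbed component
    # d/dx log p(x) with p(x) proportional to exp(-(x^2 + m^2) / (2v)) * cosh(m x / v)
    return (-x + m * np.tanh(m * x / v)) / v

def separability_gap(grid):
    """Ratio of second to first singular value after normalizing each column.

    A grid of the form f(x_i) * g(sigma_j) has rank one, so the ratio is ~0.
    """
    cols = grid / np.linalg.norm(grid, axis=0, keepdims=True)
    s = np.linalg.svd(cols, compute_uv=False)
    return s[1] / s[0]

x = np.linspace(-4.0, 4.0, 201)[:, None]     # spatial grid, shape (201, 1)
sigma = np.geomspace(0.1, 5.0, 30)[None, :]  # noise levels, shape (1, 30)

gap_mixture = separability_gap(mixture_score(x, sigma))          # bimodal data
gap_gaussian = separability_gap(mixture_score(x, sigma, m=0.0))  # separable control

print(f"mixture gap:  {gap_mixture:.3e}")
print(f"gaussian gap: {gap_gaussian:.3e}")
```

Under these assumptions the Gaussian control gives a gap at the level of floating-point error, while the mixture gives a gap of order one, consistent with the statement that the product structure cannot fully capture the correct score for general data.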