Solving Prior Distribution Mismatch in Diffusion Models via Optimal Transport
The paper addresses a theoretical gap in diffusion models that is relevant to researchers, though the contribution is incremental because it builds on existing optimal transport concepts.
The paper tackles prior distribution mismatch in diffusion models by linking them to optimal transport theory. It shows that, as the diffusion termination time increases, the probability flow converges to the gradient of the solution to the Monge-Ampère equation. It then applies this result to accelerate sampling, with experimental validation on image datasets.
In recent years, understanding of diffusion models (DMs) has grown significantly, though several theoretical gaps remain. Particularly noteworthy is the prior error, defined as the discrepancy between the termination distribution of the forward process and the initial distribution of the reverse process. To address this deficiency, this paper explores the deeper relationship between optimal transport (OT) theory and DMs with a discrete initial distribution. Specifically, we demonstrate that the two stages of DMs fundamentally involve computing time-dependent OT. However, the unavoidable prior error results in deviation during the reverse process under quadratic transport cost. By proving that, as the diffusion termination time increases, the probability flow converges exponentially to the gradient of the solution to the classical Monge-Ampère equation, we establish a vital link between these fields. Therefore, static OT emerges as the most intrinsic single-step method for bridging this theoretical potential gap. Additionally, we apply these insights to accelerate sampling in both unconditional and conditional generation scenarios. Experimental results across multiple image datasets validate the effectiveness of our approach.
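
To make the notion of a static, single-step OT map concrete, the following is a minimal sketch in one dimension and is not the paper's algorithm. It assumes a standard Gaussian prior, a discrete target with uniform weights, and quadratic cost; in that setting the optimal map is the monotone rearrangement, which is the derivative of a convex piecewise-linear potential. The function name `static_ot_map_1d` and the sample data are hypothetical, chosen only for illustration.

```python
from statistics import NormalDist


def static_ot_map_1d(data):
    """Build the monotone map pushing N(0, 1) onto the uniform measure on `data`.

    Assumption: 1D, quadratic cost, equal weight 1/n on each data point.
    In this setting the non-decreasing map is the optimal transport map.
    """
    points = sorted(data)
    n = len(points)
    std_normal = NormalDist(0.0, 1.0)

    def transport(z):
        # The Gaussian CDF sends z to a quantile level u in [0, 1].
        u = std_normal.cdf(z)
        # Each data point owns a quantile interval of width 1/n.
        k = min(int(u * n), n - 1)
        return points[k]

    return transport


# Hypothetical discrete "dataset" of four points.
data = [2.0, -0.5, 0.5, -2.0]
transport = static_ot_map_1d(data)

# A single evaluation maps a noise sample directly to a data point.
samples = [transport(z) for z in (-3.0, -0.1, 0.1, 3.0)]
print(samples)
```

Because the map is evaluated once per noise sample, it plays the role of a single-step sampler in this toy setting. In higher dimensions the monotone rearrangement is no longer available in closed form, and the map has to be obtained by solving an OT problem numerically.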