Speed-accuracy relations for diffusion models: Wisdom from nonequilibrium thermodynamics and optimal transport
This work provides a theoretical framework for improving the efficiency of diffusion models. The contribution is incremental: it builds on existing theories of nonequilibrium thermodynamics and optimal transport to address optimization challenges in generative AI.
The paper tackles the problem of optimizing data generation in diffusion models by deriving speed-accuracy relations that link generation accuracy to entropy production rates, using insights from nonequilibrium thermodynamics and optimal transport. It numerically validates these relations across various noise schedules and datasets, including real-world images, and introduces an optimal learning protocol based on geodesics in Wasserstein distance.
We discuss a connection between a class of generative models, called diffusion models, and nonequilibrium thermodynamics for the Fokker-Planck equation, called stochastic thermodynamics. Using techniques from stochastic thermodynamics, we derive the speed-accuracy relations for diffusion models, which are inequalities that relate the accuracy of data generation to the entropy production rate. The entropy production rate can be interpreted as the speed of the diffusion dynamics in the absence of a non-conservative force. From a stochastic thermodynamic perspective, our results provide quantitative insight into how best to generate data in diffusion models. The optimal learning protocol is given by the geodesic in the space of probability distributions equipped with the 2-Wasserstein distance from optimal transport theory. We numerically illustrate the validity of the speed-accuracy relations for diffusion models with different noise schedules and different data, and we compare optimal and suboptimal learning protocols numerically. We also demonstrate the applicability of our results to data generation from real-world image datasets.
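The geodesic picture can be made concrete in the simplest case of one-dimensional Gaussians, where the 2-Wasserstein distance has a closed form and the geodesic interpolates the mean and standard deviation linearly. The sketch below is an illustration under that Gaussian assumption only, not the protocol or the code used in the paper; the function names are hypothetical.

```python
import math

def w2_gaussian_1d(m0, s0, m1, s1):
    """2-Wasserstein distance between N(m0, s0^2) and N(m1, s1^2) in 1D."""
    return math.hypot(m1 - m0, s1 - s0)

def geodesic_point(m0, s0, m1, s1, t):
    """(mean, std) at time t in [0, 1] on the W2 geodesic between the two Gaussians."""
    return (1 - t) * m0 + t * m1, (1 - t) * s0 + t * s1

def geodesic_path(m0, s0, m1, s1, steps):
    """Discretize the geodesic into steps + 1 equally spaced (mean, std) points."""
    return [geodesic_point(m0, s0, m1, s1, k / steps) for k in range(steps + 1)]

def path_length(points):
    """Sum of W2 distances between consecutive (mean, std) points."""
    return sum(w2_gaussian_1d(*a, *b) for a, b in zip(points, points[1:]))

# Geodesic from N(0, 1) to N(3, 5^2) versus a detour that first moves the
# mean and only then changes the standard deviation.
geodesic = geodesic_path(0.0, 1.0, 3.0, 5.0, steps=10)
detour = [(0.0, 1.0), (3.0, 1.0), (3.0, 5.0)]
geodesic_length = path_length(geodesic)
detour_length = path_length(detour)
```

Along the geodesic every step covers the same 2-Wasserstein distance, so the path is traversed at constant speed and its total length equals the distance between the endpoints. The detour reaches the same endpoint but is longer, which is the sense in which a non-geodesic protocol is suboptimal in this toy setting.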