Score-based Generative Modeling Secretly Minimizes the Wasserstein Distance
This work provides a theoretical foundation for score-based generative models, addressing a gap in the understanding of their optimization properties. The contribution is incremental, since it builds on prior work linking these models to the KL divergence.
The paper shows that score-based generative models minimize the Wasserstein distance between the generated and data distributions under certain assumptions. It proves an upper bound on this distance in terms of the training objective and supports the result with numerical experiments.
Score-based generative models have achieved remarkable empirical performance in various applications such as image generation and audio synthesis. However, the theoretical understanding of score-based diffusion models is still incomplete. Recently, Song et al. showed that the training objective of score-based generative models is equivalent to minimizing the Kullback-Leibler divergence of the generated distribution from the data distribution. In this work, we show that score-based models also minimize the Wasserstein distance between the two distributions under suitable assumptions on the model. Specifically, we prove that the Wasserstein distance is upper bounded by the square root of the objective function, up to multiplicative constants and a fixed constant offset. Our proof is based on a novel application of the theory of optimal transport, which may be of independent interest to the community. Our numerical experiments support our findings. By analyzing our upper bounds, we provide a few techniques for obtaining tighter upper bounds.
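
The shape of the bound described above can be written schematically as follows. This is only a sketch of the stated form: the symbols $W$, $p_{\mathrm{data}}$, $p_\theta$, $J(\theta)$, $C_1$ and $C_2$ are placeholder names assumed here for illustration, and the abstract specifies neither the order of the Wasserstein distance nor what the constants depend on.

```latex
% Schematic form only; all symbol names are assumptions, not the paper's notation.
%   W              : the Wasserstein distance between two distributions
%   p_data         : the data distribution
%   p_theta        : the distribution generated by the model with parameters theta
%   J(theta)       : the training objective of the score-based model
%   C_1 > 0        : multiplicative constant
%   C_2 >= 0       : fixed constant offset
\[
  W\bigl(p_{\mathrm{data}},\, p_\theta\bigr)
  \;\le\;
  C_1 \sqrt{J(\theta)} \;+\; C_2 .
\]
```

Read this way, driving the objective $J(\theta)$ toward zero pushes the upper bound down to the offset $C_2$, which is the sense in which training also controls the Wasserstein distance.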