Conditional Image Generation with Score-Based Diffusion Models
This work addresses conditional image generation, a problem relevant to researchers and practitioners in generative AI. The contribution is incremental in that it builds on existing score-based diffusion models.
The paper systematically compares and theoretically analyzes methods for conditional image generation with score-based diffusion models, proves results that justify one of the most successful estimators of the conditional score, and introduces a multi-speed diffusion framework whose new conditional score estimator performs on par with previous state-of-the-art approaches.
Score-based diffusion models have emerged as one of the most promising frameworks for deep generative modelling. In this work, we conduct a systematic comparison and theoretical analysis of different approaches to learning conditional probability distributions with score-based diffusion models. In particular, we prove results that provide a theoretical justification for one of the most successful estimators of the conditional score. Moreover, we introduce a multi-speed diffusion framework, which leads to a new estimator for the conditional score that performs on par with previous state-of-the-art approaches. Our theoretical and experimental findings are accompanied by an open-source library, MSDiff, which supports the application of multi-speed diffusion models and further research on them.
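To make the notion of a conditional score estimator concrete, the following is a minimal sketch of a conditional denoising score matching objective on toy Gaussian data. It is an illustration under stated assumptions, not the paper's method or the MSDiff API. The geometric noise schedule, the variance-exploding perturbation, the weighting by sigma squared, and all names (`sigma`, `conditional_dsm_loss`, `true_score`, `TAU`) are hypothetical choices made for this example. Only x is perturbed, while the condition y is passed to the score function unperturbed.

```python
import numpy as np

TAU = 0.5  # assumed toy parameter: x0 = y + TAU * eps


def sigma(t, sigma_min=0.01, sigma_max=5.0):
    """Geometric noise schedule (an assumed choice for this sketch)."""
    return sigma_min * (sigma_max / sigma_min) ** t


def conditional_dsm_loss(score_fn, x0, y, rng):
    """Monte Carlo estimate of a conditional denoising score matching loss.

    x0 is perturbed with Gaussian noise of scale sigma(t); y stays clean.
    """
    n = x0.shape[0]
    t = rng.uniform(1e-3, 1.0, size=(n, 1))
    s = sigma(t)
    z = rng.standard_normal(x0.shape)
    xt = x0 + s * z
    # The score of the Gaussian perturbation kernel is -(xt - x0) / s**2 = -z / s.
    # Weighting the squared error by s**2 gives the residual s * score + z.
    residual = s * score_fn(xt, y, t) + z
    return float(np.mean(np.sum(residual ** 2, axis=1)))


def true_score(xt, y, t):
    """Exact conditional score for the toy model: xt | y ~ N(y, TAU^2 + sigma(t)^2)."""
    return -(xt - y) / (TAU ** 2 + sigma(t) ** 2)


def zero_score(xt, y, t):
    """Baseline that ignores both the input and the condition."""
    return np.zeros_like(xt)


# Toy joint distribution: y is standard normal, x0 is a noisy copy of y.
data_rng = np.random.default_rng(0)
y = data_rng.standard_normal((4096, 2))
x0 = y + TAU * data_rng.standard_normal((4096, 2))

# Use the same seed for both evaluations so they see identical noise draws.
loss_true = conditional_dsm_loss(true_score, x0, y, np.random.default_rng(1))
loss_zero = conditional_dsm_loss(zero_score, x0, y, np.random.default_rng(1))
```

In this toy setting the exact conditional score attains a lower loss than the baseline, which is the behaviour one would expect from an objective whose minimizer is the conditional score. In practice `score_fn` would be a neural network trained by gradient descent on this kind of loss.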