Easing Color Shifts in Score-Based Diffusion Models
This work addresses a specific problem in image generation that is relevant to researchers and practitioners using diffusion models. The contribution is incremental, since it builds on a previously introduced approach.
The paper tackles color shifts in score-based diffusion models by evaluating a nonlinear bypass connection in the score network that processes the spatial mean of the input and predicts the mean of the score function. The architecture substantially improves the quality of the generated images, and the improvement is approximately independent of image size.
Generated images of score-based models can suffer from errors in their spatial means, an effect referred to as a color shift that grows for larger images. This paper investigates a previously introduced approach to mitigate color shifts in score-based diffusion models. We quantify the performance of a nonlinear bypass connection in the score network, designed to process the spatial mean of the input and to predict the mean of the score function. We show that this network architecture substantially improves the quality of the generated images, and that this improvement is approximately independent of the size of the generated images. As a result, this modified architecture offers a simple solution to the color shift problem across image sizes. We additionally discuss the origin of color shifts in an idealized setting in order to motivate the approach.
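As a rough illustration of the idea, the following minimal sketch shows one way a mean bypass could be wired around a score network. It is not the authors' implementation: the helper names (`init_params`, `mean_bypass_mlp`, `score_with_bypass`), the two-layer tanh network, and the choice to remove the base network's own spatial mean before adding the predicted one are all assumptions made for this example.

```python
import numpy as np


def spatial_mean(x):
    """Per-channel mean over height and width; x has shape (C, H, W)."""
    return x.mean(axis=(1, 2))


def init_params(rng, channels, hidden):
    """Random weights for a hypothetical two-layer bypass network."""
    w1 = rng.normal(scale=0.1, size=(hidden, channels + 1))
    b1 = np.zeros(hidden)
    w2 = rng.normal(scale=0.1, size=(channels, hidden))
    b2 = np.zeros(channels)
    return w1, b1, w2, b2


def mean_bypass_mlp(m, t, params):
    """Nonlinear map from (spatial mean of input, time t) to a predicted
    per-channel mean of the score. The architecture is an assumption."""
    w1, b1, w2, b2 = params
    h = np.tanh(w1 @ np.concatenate([m, [t]]) + b1)
    return w2 @ h + b2


def score_with_bypass(x, t, base_net, params):
    """Combine a base score network with the mean bypass.

    Assumption: the base output's spatial mean is removed, so the
    bypass alone determines the spatial mean of the final score.
    """
    s = base_net(x, t)                                # shape (C, H, W)
    s_zero_mean = s - spatial_mean(s)[:, None, None]  # drop the base mean
    m_pred = mean_bypass_mlp(spatial_mean(x), t, params)
    return s_zero_mean + m_pred[:, None, None]        # add the predicted mean
```

In this sketch the spatial mean of the returned score depends only on the spatial mean of the input and on `t`, not on the image height or width, which is the kind of size independence the abstract describes. Any callable that maps `(x, t)` to an array of the same shape as `x` can stand in for `base_net`.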