SinDDM: A Single Image Denoising Diffusion Model
SinDDM enables image generation and editing from minimal data, which is useful for applications with limited datasets, though it is an incremental improvement over existing diffusion models.
The authors tackle the problem of training a denoising diffusion model on a single image instead of a large dataset. The resulting method generates diverse, high-quality samples and applies to tasks such as style transfer and text-guided generation.
Denoising diffusion models (DDMs) have led to staggering performance leaps in image generation, editing and restoration. However, existing DDMs use very large datasets for training. Here, we introduce a framework for training a DDM on a single image. Our method, which we coin SinDDM, learns the internal statistics of the training image by using a multi-scale diffusion process. To drive the reverse diffusion process, we use a fully-convolutional light-weight denoiser, which is conditioned on both the noise level and the scale. This architecture allows generating samples of arbitrary dimensions, in a coarse-to-fine manner. As we illustrate, SinDDM generates diverse high-quality samples, and is applicable in a wide array of tasks, including style transfer and harmonization. Furthermore, it can be easily guided by external supervision. Particularly, we demonstrate text-guided generation from a single image using a pre-trained CLIP model.
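To make the multi-scale idea concrete, here is a minimal sketch of how training examples could be drawn from one image at several scales. It is an illustration under stated assumptions, not the authors' implementation: the helper names (`build_pyramid`, `forward_noise`), the scale ratio of 0.75, the linear beta schedule, and the nearest-neighbour resizing are all hypothetical choices. The noising step is the standard DDPM forward process, which may differ from the exact multi-scale process used in SinDDM.

```python
import numpy as np

def build_pyramid(image, num_scales=4, ratio=0.75):
    """Return a coarse-to-fine list of downsampled copies of `image`.

    Uses nearest-neighbour index sampling (an assumption made to keep
    the sketch dependency-free). The last entry is the original image.
    """
    h, w = image.shape[:2]
    pyramid = []
    for s in range(num_scales):
        factor = ratio ** (num_scales - 1 - s)
        nh, nw = max(1, round(h * factor)), max(1, round(w * factor))
        rows = np.minimum((np.arange(nh) * h / nh).astype(int), h - 1)
        cols = np.minimum((np.arange(nw) * w / nw).astype(int), w - 1)
        pyramid.append(image[rows][:, cols])
    return pyramid

def forward_noise(x0, t, alpha_bar, rng):
    """Standard DDPM forward step:
    x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * eps.
    """
    eps = rng.standard_normal(x0.shape)
    x_t = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps
    return x_t, eps

# Hypothetical linear noise schedule with T timesteps.
T = 100
betas = np.linspace(1e-4, 0.02, T)
alpha_bar = np.cumprod(1.0 - betas)

rng = np.random.default_rng(0)
image = rng.random((64, 48, 3))  # stand-in for the single training image

pyramid = build_pyramid(image)

# One training example per scale: a denoiser would receive
# (x_t, t, s) and be trained to predict eps.
training_pairs = []
for s, x0 in enumerate(pyramid):
    t = int(rng.integers(0, T))
    x_t, eps = forward_noise(x0, t, alpha_bar, rng)
    training_pairs.append((s, t, x_t, eps))
```

Each tuple carries both the scale index `s` and the timestep `t`, mirroring the abstract's description of a denoiser conditioned on the noise level and the scale. Because a fully-convolutional network has no fixed input size, the same weights could in principle be applied to every level of the pyramid.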