CV · GR · LG · Apr 14, 2023

Delta Denoising Score

arXiv:2304.07090v1 · 142 citations · h-index: 117
Originality: Incremental advance
AI Analysis

This paper addresses a specific bottleneck in using text-to-image diffusion models for image editing, offering incremental improvements in stability and quality.

The paper tackles the noisy gradients that arise when Score Distillation Sampling (SDS) is used for text-based image editing. It introduces Delta Denoising Score (DDS), which removes the erroneous gradient directions to produce more stable, higher-quality outputs, and reports that it outperforms existing methods.

We introduce Delta Denoising Score (DDS), a novel scoring function for text-based image editing that guides minimal modifications of an input image towards the content described in a target prompt. DDS leverages the rich generative prior of text-to-image diffusion models and can be used as a loss term in an optimization problem to steer an image towards a desired direction dictated by a text. DDS utilizes the Score Distillation Sampling (SDS) mechanism for the purpose of image editing. We show that using only SDS often produces non-detailed and blurry outputs due to noisy gradients. To address this issue, DDS uses a prompt that matches the input image to identify and remove undesired erroneous directions of SDS. Our key premise is that SDS should be zero when calculated on pairs of matched prompts and images, meaning that if the score is non-zero, its gradients can be attributed to the erroneous component of SDS. Our analysis demonstrates the competence of DDS for text-based image-to-image translation. We further show that DDS can be used to train an effective zero-shot image translation model. Experimental results indicate that DDS outperforms existing methods in terms of stability and quality, highlighting its potential for real-world applications in text-based image editing.
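The premise in the abstract (SDS should vanish on a matched image and prompt, so whatever remains is error) can be illustrated with a small numerical sketch. This is not the authors' implementation: `toy_eps` is a hypothetical stand-in for a text-conditioned noise predictor, the prompt vectors are made-up embeddings, the timestep weighting is omitted, and sharing the same noise and timestep between the two branches is an assumption about how the subtraction is set up.

```python
import numpy as np


def toy_eps(z_t, prompt_vec, t):
    """Hypothetical noise predictor; a real one would be a diffusion U-Net."""
    return 0.5 * z_t - 0.1 * prompt_vec + 0.01 * t


def add_noise(z, noise, alpha_bar):
    """Standard forward-diffusion step for a given cumulative alpha."""
    return np.sqrt(alpha_bar) * z + np.sqrt(1.0 - alpha_bar) * noise


def sds_grad(eps_model, z, prompt_vec, noise, t, alpha_bar):
    """SDS direction (weighting omitted): predicted noise minus injected noise."""
    z_t = add_noise(z, noise, alpha_bar)
    return eps_model(z_t, prompt_vec, t) - noise


def dds_grad(eps_model, z, target_prompt, z_ref, ref_prompt, noise, t, alpha_bar):
    """Delta of two SDS directions: edited branch minus matched reference branch.

    The reference branch (input image with its matching prompt) estimates the
    erroneous component, which is subtracted from the edited branch.
    """
    edited = sds_grad(eps_model, z, target_prompt, noise, t, alpha_bar)
    reference = sds_grad(eps_model, z_ref, ref_prompt, noise, t, alpha_bar)
    return edited - reference


rng = np.random.default_rng(0)
z_ref = rng.normal(size=8)          # stand-in for the input image (or latent)
ref_prompt = rng.normal(size=8)     # made-up embedding of the matching prompt
target_prompt = rng.normal(size=8)  # made-up embedding of the target prompt
noise = rng.normal(size=8)
t, alpha_bar = 0.3, 0.7

# Plain SDS on the matched pair is generally non-zero: this is the noisy part.
sds_matched = sds_grad(toy_eps, z_ref, ref_prompt, noise, t, alpha_bar)

# The delta on the matched pair cancels exactly, so an unedited image stays put.
dds_matched = dds_grad(toy_eps, z_ref.copy(), ref_prompt, z_ref, ref_prompt,
                       noise, t, alpha_bar)

# With a different target prompt, only the prompt-driven direction survives.
dds_edit = dds_grad(toy_eps, z_ref.copy(), target_prompt, z_ref, ref_prompt,
                    noise, t, alpha_bar)
```

In an optimization loop, `dds_edit` would be used as the update direction for the edited image `z`, starting from `z = z_ref`; the toy model makes the cancellation exact, whereas a real network would only cancel the shared error approximately.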

