GR · CV · LG · Mar 14, 2025

LUSD: Localized Update Score Distillation for Text-Guided Image Editing

arXiv:2503.11054v2 · 1 citation · h-index: 16
Originality: Incremental advance
AI Analysis

This work addresses a specific challenge in image editing for users needing precise control over object insertion while preserving backgrounds, representing an incremental improvement over existing score distillation methods.

The paper tackles text-guided image editing with diffusion models, where achieving both prompt fidelity and background preservation is difficult, especially for object insertion. The proposed method, LUSD, adds attention-based spatial regularization and gradient filtering-normalization to reduce variations in gradient magnitude and spatial distribution during updates. It outperforms state-of-the-art score distillation techniques in prompt fidelity, and users preferred it by 58-64% overall.

While diffusion models show promising results in image editing given a target prompt, achieving both prompt fidelity and background preservation remains difficult. Recent works have introduced score distillation techniques that leverage the rich generative prior of text-to-image diffusion models to solve this task without additional fine-tuning. However, these methods often struggle with tasks such as object insertion. Our investigation of these failures reveals significant variations in gradient magnitude and spatial distribution, making hyperparameter tuning highly input-specific or unsuccessful. To address this, we propose two simple yet effective modifications: attention-based spatial regularization and gradient filtering-normalization, both aimed at reducing these variations during gradient updates. Experimental results show our method outperforms state-of-the-art score distillation techniques in prompt fidelity, improving successful edits while preserving the background. Users also preferred our method over state-of-the-art techniques across three metrics, and by 58-64% overall.
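The two modifications named in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the function names, the top-magnitude filtering rule, the RMS normalization, and the parameters `keep_fraction` and `lr` are all assumptions chosen for illustration, and the attention map is assumed to be a non-negative array of the same shape as the gradient. The sketch only shows the general idea of confining an update to an attended region and making its step size independent of the raw gradient magnitude.

```python
import numpy as np


def filter_gradient(grad, keep_fraction=0.5):
    """Keep only the largest-magnitude entries (hypothetical filtering rule)."""
    mag = np.abs(grad)
    threshold = np.quantile(mag, 1.0 - keep_fraction)
    return np.where(mag >= threshold, grad, 0.0)


def normalize_gradient(grad, eps=1e-8):
    """Rescale to unit RMS so the step size ignores the raw gradient scale."""
    rms = np.sqrt(np.mean(grad ** 2))
    return grad / (rms + eps)


def localized_update(latent, grad, attention, lr=0.1, keep_fraction=0.5):
    """One gradient step restricted to the attended region (illustrative only)."""
    # Scale the attention map to [0, 1] and use it as a soft spatial mask.
    mask = attention / (attention.max() + 1e-8)
    g = filter_gradient(grad * mask, keep_fraction)
    g = normalize_gradient(g)
    return latent - lr * g


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    latent = np.zeros((4, 4))
    grad = rng.normal(size=(4, 4))
    attention = np.zeros((4, 4))
    attention[:, 2:] = 1.0  # pretend the edit region is the right half
    print(localized_update(latent, grad, attention))
```

In this toy setup the left half of the latent is left untouched because the mask is zero there, and multiplying the input gradient by a constant leaves the update essentially unchanged, which is the kind of input-independence the abstract motivates.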

Code Implementations: 1 repo
