CV, LG · Mar 20, 2024

Ground-A-Score: Scaling Up the Score Distillation for Multi-Attribute Editing

arXiv:2403.13551v1 · 3 citations · h-index: 11
Originality: Incremental advance
AI Analysis

This paper addresses multi-attribute image editing with text-to-image diffusion models and represents an incremental improvement over existing score distillation techniques.

The paper tackles the problem of text-to-image diffusion models overlooking some requests in complex prompts, which the authors attribute to a bottleneck in text processing. It presents Ground-A-Score, a model-agnostic image editing method that incorporates grounding during score distillation so that edits reflect intricate prompt requirements. Qualitative and quantitative analyses are reported to confirm that the method adheres to extended, multifaceted prompts while preserving the attributes of the original image.

Despite recent advancements in text-to-image diffusion models facilitating various image editing techniques, complex text prompts often lead to an oversight of some requests due to a bottleneck in processing text information. To tackle this challenge, we present Ground-A-Score, a simple yet powerful model-agnostic image editing method that incorporates grounding during score distillation. This approach ensures a precise reflection of intricate prompt requirements in the editing outcomes, taking into account the prior knowledge of the object locations within the image. Moreover, selective application with a new penalty coefficient and a contrastive loss helps to precisely target editing areas while preserving the integrity of the objects in the source image. Both qualitative assessments and quantitative analyses confirm that Ground-A-Score successfully adheres to the intricate details of extended and multifaceted prompts, ensuring high-quality outcomes that respect the original image attributes.
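
The idea of grounding during score distillation can be illustrated with a toy sketch. This is not the authors' implementation: it only assumes that each attribute in a complex prompt yields its own gradient map and its own bounding box, and that the maps are combined after being restricted to their boxes. The function names `box_mask` and `aggregate_grounded_gradients`, the box format, and the `weights` argument (a loose stand-in for a per-attribute coefficient) are all hypothetical, and the gradients are constant placeholders rather than outputs of a diffusion model.

```python
def box_mask(height, width, box):
    """Binary mask: 1.0 inside box = (top, left, bottom, right), ends exclusive."""
    top, left, bottom, right = box
    return [[1.0 if top <= r < bottom and left <= c < right else 0.0
             for c in range(width)]
            for r in range(height)]


def aggregate_grounded_gradients(sub_grads, boxes, weights=None):
    """Sum per-attribute gradient maps, each restricted to its own region.

    sub_grads: list of H x W nested lists (hypothetical per-attribute gradients)
    boxes:     list of (top, left, bottom, right) tuples, one per attribute
    weights:   optional per-attribute scalars (assumed, defaults to 1.0 each)
    """
    height, width = len(sub_grads[0]), len(sub_grads[0][0])
    if weights is None:
        weights = [1.0] * len(sub_grads)
    total = [[0.0] * width for _ in range(height)]
    for grad, box, weight in zip(sub_grads, boxes, weights):
        mask = box_mask(height, width, box)
        for r in range(height):
            for c in range(width):
                # Gradient contributes only where its attribute is grounded.
                total[r][c] += weight * mask[r][c] * grad[r][c]
    return total


# Two placeholder attribute gradients on a 4 x 4 grid with disjoint boxes.
grads = [[[1.0] * 4 for _ in range(4)], [[2.0] * 4 for _ in range(4)]]
boxes = [(0, 0, 2, 2), (2, 2, 4, 4)]
combined = aggregate_grounded_gradients(grads, boxes)
```

In this toy setup each attribute's update touches only its own region, and pixels outside every box receive no update, which mirrors the abstract's goal of targeting editing areas while leaving the rest of the source image intact.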

Code Implementations: 1 repo
