CV · Jul 25, 2024

FlexiEdit: Frequency-Aware Latent Refinement for Enhanced Non-Rigid Editing

arXiv:2407.17850v1 · 31 citations · h-index: 11
Originality: Incremental advance
AI Analysis

FlexiEdit addresses a specific challenge in image editing, namely edits that require complex layout adjustments, and represents incremental progress.

The paper tackles non-rigid image editing, where current methods struggle with layout changes. It introduces FlexiEdit, which refines the DDIM latent by reducing high-frequency components in targeted areas. Comparative experiments show improved fidelity to text prompts.

Current image editing methods primarily utilize DDIM Inversion, employing a two-branch diffusion approach to preserve the attributes and layout of the original image. However, these methods encounter challenges with non-rigid edits, which involve altering the image's layout or structure. Our comprehensive analysis reveals that the high-frequency components of the DDIM latent, crucial for retaining the original image's key features and layout, significantly contribute to these limitations. Addressing this, we introduce FlexiEdit, which enhances fidelity to input text prompts by refining the DDIM latent, specifically by reducing high-frequency components in targeted editing areas. FlexiEdit comprises two key components: (1) Latent Refinement, which modifies the DDIM latent to better accommodate layout adjustments, and (2) Edit Fidelity Enhancement via Re-inversion, aimed at ensuring the edits more accurately reflect the input text prompts. Our approach represents notable progress in image editing, particularly in performing complex non-rigid edits, showcasing its enhanced capability through comparative experiments.
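The idea of reducing high-frequency components in a targeted area can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function name `reduce_high_freq`, the hard radial cutoff, and the `cutoff` and `alpha` parameters are all assumptions made for illustration, and the paper's actual Latent Refinement step may filter and blend differently.

```python
import numpy as np

def reduce_high_freq(latent, mask, cutoff=0.25, alpha=0.0):
    """Attenuate high spatial frequencies of a latent inside a masked region.

    Hypothetical helper, not taken from the FlexiEdit code base.
    latent: array of shape (C, H, W), e.g. an inverted DDIM latent.
    mask:   array of shape (H, W) with values in [0, 1]; 1 marks the edit area.
    cutoff: radial frequency (cycles per pixel) above which components are scaled.
    alpha:  scale applied to components above the cutoff (0 removes them).
    """
    _, h, w = latent.shape
    # Per-axis frequencies in cycles per pixel, in the range [-0.5, 0.5).
    fy = np.fft.fftfreq(h)[:, None]
    fx = np.fft.fftfreq(w)[None, :]
    radius = np.sqrt(fx ** 2 + fy ** 2)
    # Keep low frequencies as they are, scale everything else by alpha.
    scale = np.where(radius <= cutoff, 1.0, alpha)
    spectrum = np.fft.fft2(latent, axes=(-2, -1))
    filtered = np.fft.ifft2(spectrum * scale, axes=(-2, -1)).real
    # Use the filtered latent only where the mask is set.
    return mask * filtered + (1.0 - mask) * latent

# Usage: filter the central 32x32 block of a random 4x64x64 latent.
rng = np.random.default_rng(0)
latent = rng.standard_normal((4, 64, 64))
mask = np.zeros((64, 64))
mask[16:48, 16:48] = 1.0
refined = reduce_high_freq(latent, mask, cutoff=0.1)
```

Outside the mask the latent is returned unchanged, which mirrors the stated goal of preserving the original image away from the edit region while loosening layout constraints inside it.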

Code Implementations: 1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
