GR, AI, CV; Aug 30, 2025

LatentEdit: Adaptive Latent Control for Consistent Semantic Editing

arXiv:2509.00541v1, h-index: 8, PRCV
Originality: Incremental advance
AI Analysis

This work addresses the problem of maintaining background similarity without sacrificing efficiency in diffusion-based image editing, aimed at computer vision and AI applications. It represents an incremental improvement, offering a lightweight, plug-and-play solution.

The paper tackles the challenge of achieving high-quality, consistent semantic image editing with diffusion models. It introduces LatentEdit, an adaptive latent fusion framework that dynamically combines source and target latents. The authors report an optimal balance between fidelity and editability, outperforming state-of-the-art methods in 8-15 steps.

Diffusion-based image editing has achieved significant success in recent years. However, it remains challenging to achieve high-quality image editing while maintaining background similarity without sacrificing speed or memory efficiency. In this work, we introduce LatentEdit, an adaptive latent fusion framework that dynamically combines the current latent code with a reference latent code inverted from the source image. By selectively preserving source features in high-similarity, semantically important regions while generating target content in other regions guided by the target prompt, LatentEdit enables fine-grained, controllable editing. Critically, the method requires no internal model modifications or complex attention mechanisms, offering a lightweight, plug-and-play solution compatible with both UNet-based and DiT-based architectures. Extensive experiments on the PIE-Bench dataset demonstrate that our proposed LatentEdit achieves an optimal balance between fidelity and editability, outperforming the state-of-the-art method even in 8-15 steps. Additionally, its inversion-free variant further halves the number of neural function evaluations and eliminates the need for storing any intermediate variables, substantially enhancing real-time deployment efficiency.
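
To make the idea of similarity-guided latent fusion more concrete, here is a minimal sketch in Python with NumPy. It is not the authors' implementation: the abstract does not give the fusion rule, so the per-location cosine similarity, the sigmoid soft mask, and the names `fuse_latents`, `threshold`, and `sharpness` are all assumptions chosen for illustration. The sketch only shows the general pattern of keeping the reference (source) latent where it agrees with the current latent and keeping the current latent elsewhere.

```python
import numpy as np


def fuse_latents(current, reference, threshold=0.5, sharpness=10.0):
    """Blend a current latent with a reference latent using a similarity mask.

    current, reference: arrays of shape (C, H, W).
    threshold, sharpness: hypothetical knobs, not taken from the paper.
    """
    # Cosine similarity across the channel axis, one value per spatial location.
    dot = (current * reference).sum(axis=0)
    norms = np.linalg.norm(current, axis=0) * np.linalg.norm(reference, axis=0)
    similarity = dot / (norms + 1e-8)  # shape (H, W), values in [-1, 1]

    # Soft mask: close to 1 where the latents agree, close to 0 where they differ.
    mask = 1.0 / (1.0 + np.exp(-sharpness * (similarity - threshold)))

    # Keep source features where the mask is high, edited content elsewhere.
    fused = mask[None] * reference + (1.0 - mask[None]) * current
    return fused, mask


rng = np.random.default_rng(0)
reference = rng.normal(size=(4, 8, 8))
current = reference.copy()
# Simulate an edited region: the top-left 4x4 patch diverges from the source.
current[:, :4, :4] = -reference[:, :4, :4]

fused, mask = fuse_latents(current, reference)
```

In this toy setup the mask is near zero over the simulated edited patch and near one everywhere else, so the fused latent keeps the edited content in the patch and the source content in the background. In a real pipeline a step like this would presumably run at each denoising step on the model's latents, but how and when the paper applies it is not specified in the abstract.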

