CV | Oct 23, 2025

EditInfinity: Image Editing with Binary-Quantized Generative Models

Jiahuan Wang, Yuxin Chen, Jun Yu, Guangming Lu, Wenjie Pei

arXiv:2510.20217v3 | 4 citations | h-index: 26 | Has Code

Originality: Incremental advance

AI Analysis

This work improves image editing for users needing precise control, though it is incremental as it builds on existing adaptation paradigms.

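The paper tackles text-driven image editing, where approximation errors in diffusion model inversion limit editing quality. It proposes EditInfinity, a method that adapts a binary-quantized generative model whose exact intermediate quantized representations of the source image can be obtained and used to supervise inversion. On the PIE-Bench benchmark, the method outperforms state-of-the-art diffusion-based baselines across 'add', 'change', and 'delete' editing operations.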

Adapting pretrained diffusion-based generative models for text-driven image editing with negligible tuning overhead has demonstrated remarkable potential. A classical adaptation paradigm, as followed by these methods, first infers the generative trajectory inversely for a given source image by image inversion, then performs image editing along the inferred trajectory guided by the target text prompts. However, the performance of image editing is heavily limited by the approximation errors introduced during image inversion by diffusion models, which arise from the absence of exact supervision in the intermediate generative steps. To circumvent this issue, we investigate the parameter-efficient adaptation of binary-quantized generative models for image editing, and leverage their inherent characteristic that the exact intermediate quantized representations of a source image are attainable, enabling more effective supervision for precise image inversion. Specifically, we propose EditInfinity, which adapts Infinity, a binary-quantized generative model, for image editing. We propose an efficient yet effective image inversion mechanism that integrates text prompting rectification and image style preservation, enabling precise image inversion. Furthermore, we devise a holistic smoothing strategy which allows our EditInfinity to perform image editing with high fidelity to source images and precise semantic alignment to the text prompts. Extensive experiments on the PIE-Bench benchmark across 'add', 'change', and 'delete' editing operations demonstrate the superior performance of our model compared to state-of-the-art diffusion-based baselines. Code available at: https://github.com/yx-chen-ust/EditInfinity.
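
The abstract's central claim, that a binary-quantized model yields exact intermediate representations of a source image, can be illustrated with a toy residual binary quantizer. The sketch below is not Infinity's tokenizer or the EditInfinity code: the function names, the 1-D feature vector, and the halving magnitudes are all assumptions chosen for illustration. It only shows that a deterministic encoder produces a reproducible bit pattern at every step, which could then serve as an exact per-step target.

```python
from typing import List, Tuple


def binary_quantize(residual: List[float], magnitude: float) -> Tuple[List[int], List[float]]:
    """Quantize each entry by sign: bit 1 maps to +magnitude, bit 0 to -magnitude."""
    bits = [1 if x >= 0 else 0 for x in residual]
    values = [magnitude if b else -magnitude for b in bits]
    return bits, values


def residual_binary_encode(
    features: List[float], magnitudes: List[float]
) -> Tuple[List[List[int]], List[float]]:
    """Encode `features` step by step (hypothetical toy scheme, not Infinity's).

    At each step the current residual is sign-quantized, the quantized values
    are added to the running reconstruction, and the bit pattern is recorded.
    The recorded patterns are the exact intermediate targets for each step.
    """
    reconstruction = [0.0] * len(features)
    residual = list(features)
    per_step_bits: List[List[int]] = []
    for magnitude in magnitudes:
        bits, values = binary_quantize(residual, magnitude)
        per_step_bits.append(bits)
        reconstruction = [r + v for r, v in zip(reconstruction, values)]
        residual = [f - r for f, r in zip(features, reconstruction)]
    return per_step_bits, reconstruction


if __name__ == "__main__":
    # Illustrative values only; real models operate on multi-scale feature maps.
    steps, recon = residual_binary_encode([0.9, -0.3, 0.1], [0.5, 0.25, 0.125])
    for index, bits in enumerate(steps, start=1):
        print(f"step {index}: bits={bits}")
    print("reconstruction:", recon)
```

Because encoding is a deterministic function of the source features, running it twice on the same input returns identical bits at every step. This contrasts with diffusion inversion as described in the abstract, where the intermediate states have no exact supervision and must be approximated.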

