GRAI, Jun 15, 2025

Balancing Preservation and Modification: A Region and Semantic Aware Metric for Instruction-Based Image Editing

arXiv:2506.13827v1, 6 citations, h-index: 11, Has Code, ICML
Originality: Incremental advance
AI Analysis

This paper addresses the problem of biased evaluation in instruction-based image editing, which affects both researchers and practitioners. The advance is incremental, as it builds on existing efforts to adapt metrics to this task.

The paper tackles the lack of a comprehensive metric for instruction-based image editing by introducing BPM, a region and semantic aware metric that disentangles editing-relevant and irrelevant regions, achieving the highest alignment with human evaluation compared to existing metrics.

Instruction-based image editing, which aims to modify the image faithfully according to the instruction while preserving irrelevant content unchanged, has made significant progress. However, a comprehensive metric for assessing editing quality is still lacking. Existing metrics either incur high human evaluation costs, which hinders large-scale evaluation, or are adapted from other tasks and overlook task-specific concerns, so they fail to evaluate both instruction-based modification and preservation of irrelevant regions, resulting in biased evaluation. To tackle this, we introduce a new metric called Balancing Preservation and Modification (BPM), tailored for instruction-based image editing by explicitly disentangling the image into editing-relevant and irrelevant regions for specific consideration. We first identify and locate editing-relevant regions, followed by a two-tier process to assess editing quality: Region-Aware Judge evaluates whether the position and size of the edited region align with the instruction, and Semantic-Aware Judge further assesses the instruction content compliance within editing-relevant regions as well as content preservation within irrelevant regions, yielding a comprehensive and interpretable quality assessment. Moreover, the editing-relevant region localization in BPM can be integrated into image editing approaches to improve editing quality, demonstrating its broad applicability. We verify the effectiveness of the BPM metric on comprehensive instruction-editing data, and the results show the highest alignment with human evaluation compared to existing metrics. Code is available at: https://joyli-x.github.io/BPM/
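
The two-tier idea in the abstract can be illustrated with a toy sketch. This is not the BPM implementation: the function names, the IoU-based region check, the pixel-difference preservation term, the `compliance` input (assumed to come from some external semantic model), and the weighting formula are all assumptions made for illustration only.

```python
from typing import List

Image = List[List[float]]  # toy grayscale image, values in [0, 1]
Mask = List[List[bool]]    # True marks an editing-relevant pixel


def changed_mask(src: Image, out: Image, tol: float = 0.05) -> Mask:
    """Mark pixels whose value moved by more than `tol` (hypothetical helper)."""
    return [[abs(a - b) > tol for a, b in zip(rs, ro)]
            for rs, ro in zip(src, out)]


def region_score(expected: Mask, actual: Mask) -> float:
    """Region check (assumed form): IoU of expected vs. actually changed region."""
    inter = union = 0
    for row_e, row_a in zip(expected, actual):
        for e, a in zip(row_e, row_a):
            inter += int(e and a)
            union += int(e or a)
    return inter / union if union else 1.0


def preservation_score(src: Image, out: Image, expected: Mask) -> float:
    """Preservation (assumed form): 1 - mean abs. difference outside the region."""
    diffs = [abs(a - b)
             for rs, ro, rm in zip(src, out, expected)
             for a, b, m in zip(rs, ro, rm) if not m]
    return 1.0 - sum(diffs) / len(diffs) if diffs else 1.0


def combined_score(src: Image, out: Image, expected: Mask,
                   compliance: float, weight: float = 0.5) -> float:
    """Illustrative combination: the region check gates a weighted mix of
    instruction compliance (inside the region) and preservation (outside)."""
    region = region_score(expected, changed_mask(src, out))
    preserve = preservation_score(src, out, expected)
    return region * (weight * compliance + (1.0 - weight) * preserve)
```

In this sketch, an edit that lands in the wrong place receives a low region score regardless of how well the rest of the image is preserved, which mirrors the abstract's point that modification and preservation should be judged separately per region.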
