CV · May 1, 2025

Towards Scalable Human-aligned Benchmark for Text-guided Image Editing

arXiv:2505.00502v1 · 6 citations · h-index: 1 · Has Code · CVPR
Originality: Synthesis-oriented
AI Analysis

This paper addresses the problem of subjective evaluation in text-guided image editing, which currently leaves researchers relying on manual user studies. The contribution is incremental, since it builds on existing models and tasks.

The authors address the lack of a standard evaluation method for text-guided image editing by introducing HATIE, a large-scale benchmark with an automated evaluation pipeline designed to align with human perception. They empirically verify this alignment and report benchmark results on state-of-the-art models.

A variety of text-guided image editing models have been proposed recently. However, there is no widely-accepted standard evaluation method mainly due to the subjective nature of the task, letting researchers rely on manual user study. To address this, we introduce a novel Human-Aligned benchmark for Text-guided Image Editing (HATIE). Providing a large-scale benchmark set covering a wide range of editing tasks, it allows reliable evaluation, not limited to specific easy-to-evaluate cases. Also, HATIE provides a fully-automated and omnidirectional evaluation pipeline. Particularly, we combine multiple scores measuring various aspects of editing so as to align with human perception. We empirically verify that the evaluation of HATIE is indeed human-aligned in various aspects, and provide benchmark results on several state-of-the-art models to provide deeper insights on their performance.
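
The abstract mentions combining multiple scores that measure different aspects of an edit. As a rough illustration only, one simple way to combine such scores is a weighted average. The aspect names, the weights, and the function `combine_scores` below are hypothetical and are not taken from the paper or its code; the actual HATIE pipeline may aggregate its scores quite differently.

```python
def combine_scores(scores, weights):
    """Return the weighted average of per-aspect scores.

    `scores` and `weights` are dicts keyed by aspect name. Both the
    aspect names and the weighting scheme are illustrative assumptions,
    not the aggregation used by HATIE.
    """
    total_weight = sum(weights[name] for name in scores)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted_sum = sum(weights[name] * value for name, value in scores.items())
    return weighted_sum / total_weight


# Hypothetical per-aspect scores for one edited image, each in [0, 1].
example_scores = {
    "edit_fidelity": 0.8,
    "background_preservation": 0.6,
    "image_quality": 0.9,
}
# Hypothetical weights; in practice these could be tuned against human ratings.
example_weights = {
    "edit_fidelity": 2.0,
    "background_preservation": 1.0,
    "image_quality": 1.0,
}

overall = combine_scores(example_scores, example_weights)
print(round(overall, 3))
```

A weighted average keeps the combined score in the same range as its inputs, which makes models easy to compare. Choosing the weights so that the result tracks human judgments is the hard part, and this sketch does not attempt it.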

Code Implementations: 1 repo
