CV | May 22, 2025

KRIS-Bench: Benchmarking Next-Level Intelligent Image Editing Models

Yongliang Wu, Zonghui Li, Xinting Hu, Xinyu Ye, Xianfang Zeng, Gang Yu, Wenbo Zhu, Bernt Schiele, Ming-Hsuan Yang, Xu Yang

arXiv:2505.16707v1 | 34.859 citations | h-index: 137

Originality: Incremental advance

AI Analysis

This paper addresses a problem relevant to AI researchers: assessing the reasoning capabilities of image editing models. The advance is incremental, since it builds on existing benchmarks by adding a knowledge-centric focus.

The authors tackled the lack of knowledge-based reasoning evaluation in instruction-based image editing by introducing KRIS-Bench, a diagnostic benchmark with 1,267 annotated instances across 22 tasks, revealing significant performance gaps in 10 state-of-the-art models.

Recent advances in multi-modal generative models have enabled significant progress in instruction-based image editing. However, while these models produce visually plausible outputs, their capacity for knowledge-based reasoning editing tasks remains under-explored. In this paper, we introduce KRIS-Bench (Knowledge-based Reasoning in Image-editing Systems Benchmark), a diagnostic benchmark designed to assess models through a cognitively informed lens. Drawing from educational theory, KRIS-Bench categorizes editing tasks across three foundational knowledge types: Factual, Conceptual, and Procedural. Based on this taxonomy, we design 22 representative tasks spanning 7 reasoning dimensions and release 1,267 high-quality annotated editing instances. To support fine-grained evaluation, we propose a comprehensive protocol that incorporates a novel Knowledge Plausibility metric, enhanced by knowledge hints and calibrated through human studies. Empirical results on 10 state-of-the-art models reveal significant gaps in reasoning performance, highlighting the need for knowledge-centric benchmarks to advance the development of intelligent image editing systems.
