CV | Mar 30

SEA: Evaluating Sketch Abstraction Efficiency via Element-level Commonsense Visual Question Answering

Jiho Park, Sieun Choi, Jaeyoon Seo, Minho Sohn, Yeana Kim, Jihie Kim

arXiv:2603.28363 | 39.6 | h-index: 2

AI Analysis

This work addresses the challenge of evaluating sketch abstraction efficiency, a problem relevant to researchers in computer vision and sketch analysis. It is incremental in that it builds on existing sketch understanding methods.

The authors tackle the problem of quantifying semantic abstraction efficiency in sketches by introducing SEA, a reference-free metric that evaluates how economically a sketch represents class-defining visual elements while preserving recognizability. The metric aligns closely with human judgments. They also present CommonSketch, a dataset of 23,100 human-drawn sketches across 300 classes, each paired with a caption and element-level annotations, which serves as a benchmark for element-level sketch understanding.

A sketch is a distilled form of visual abstraction that conveys core concepts through simplified yet purposeful strokes while omitting extraneous detail. Despite its expressive power, quantifying the efficiency of semantic abstraction in sketches remains challenging. Existing evaluation methods that rely on reference images, low-level visual features, or recognition accuracy do not capture abstraction, the defining property of sketches. To address these limitations, we introduce SEA (Sketch Evaluation metric for Abstraction efficiency), a reference-free metric that assesses how economically a sketch represents class-defining visual elements while preserving semantic recognizability. These elements are derived per class from commonsense knowledge about features typically depicted in sketches. SEA leverages a visual question answering model to determine the presence of each element and returns a quantitative score that reflects semantic retention under visual economy. To support this metric, we present CommonSketch, the first semantically annotated sketch dataset, comprising 23,100 human-drawn sketches across 300 classes, each paired with a caption and element-level annotations. Experiments show that SEA aligns closely with human judgments and reliably discriminates levels of abstraction efficiency, while CommonSketch serves as a benchmark providing systematic evaluation of element-level sketch understanding across various vision-language models.
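The element-checking step described in the abstract can be illustrated with a minimal sketch. The abstract does not give SEA's scoring formula, so everything below is an assumption made for illustration: the `VQAModel` interface, the question template, the stub model, and in particular `toy_efficiency`, which is a stand-in and not the authors' metric.

```python
from typing import Callable, Dict, List

# Hypothetical VQA interface (an assumption): takes a sketch and a
# yes/no question, and returns True when the answer is "yes".
VQAModel = Callable[[object, str], bool]


def element_presence(sketch: object, elements: List[str], vqa: VQAModel) -> Dict[str, bool]:
    """Ask one yes/no question per class-defining element."""
    return {e: vqa(sketch, f"Does the sketch show {e}?") for e in elements}


def toy_efficiency(presence: Dict[str, bool], recognized: bool) -> float:
    """Toy score, NOT the paper's formula.

    Returns 0.0 for an unrecognizable sketch; otherwise returns the
    fraction of class elements the sketch managed to omit.
    """
    if not recognized or not presence:
        return 0.0
    drawn = sum(presence.values())
    return 1.0 - drawn / len(presence)


def stub_vqa(sketch: object, question: str) -> bool:
    """Stand-in for a real VQA model: the 'sketch' is a set of element names."""
    return any(name in question for name in sketch)


# Hypothetical element list for a "cat" class.
cat_elements = ["ears", "whiskers", "tail", "eyes"]
presence = element_presence({"ears", "whiskers"}, cat_elements, stub_vqa)
score = toy_efficiency(presence, recognized=True)
```

In a real setting, `stub_vqa` would be replaced by a call to an actual vision-language model, and recognizability would come from a classifier or human judgment rather than a hard-coded flag.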
