CV | Apr 23, 2024

DesignProbe: A Graphic Design Benchmark for Multimodal Large Language Models

arXiv:2404.14801v1 | 10 citations | h-index: 11
Originality: Synthesis-oriented
AI Analysis

This work addresses the problem of assessing MLLM capabilities in graphic design for researchers and practitioners, but it is incremental as it focuses on benchmarking rather than novel model development.

The authors tackle the challenge of evaluating Multimodal Large Language Models (MLLMs) in graphic design by establishing DesignProbe, a benchmark with eight tasks spanning the fine-grained element level and the overall design level. They test 9 MLLMs and find that prompt refinement, especially with image examples, boosts performance.

A well-executed graphic design typically achieves harmony at two levels, from the fine-grained design elements (color, font and layout) to the overall design. This complexity makes the comprehension of graphic design challenging, because it requires the ability both to recognize the design elements and to understand the design. With the rapid development of Multimodal Large Language Models (MLLMs), we establish DesignProbe, a benchmark to investigate the capability of MLLMs in design. Our benchmark includes eight tasks in total, across both the fine-grained element level and the overall design level. At the design element level, we consider both attribute recognition and semantic understanding tasks. At the overall design level, we include style and metaphor. We test 9 MLLMs and use GPT-4 as the evaluator. Further experiments indicate that refining prompts can enhance the performance of MLLMs. We first have different LLMs rewrite the prompts and find that performance improves when a model's prompts are refined by its own LLM. We then add extra task knowledge in two different ways (text descriptions and image examples), finding that adding images boosts performance much more than adding text.
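
The evaluation setup described in the abstract (several tasks, a model under test, and a separate evaluator that judges each answer) can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the `Sample` fields, the `evaluate` function, and the `model` and `judge` callables are all hypothetical names, and per-task accuracy is an assumed metric. In the paper the judge role is played by GPT-4; here it is any function that returns whether an answer matches the reference.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, NamedTuple


class Sample(NamedTuple):
    """One benchmark item (hypothetical schema, not from the paper)."""
    task: str        # task label, e.g. "color" or "style"
    image_path: str  # path to the design image
    question: str    # prompt shown to the model
    reference: str   # ground-truth answer


def evaluate(
    samples: Iterable[Sample],
    model: Callable[[str, str], str],
    judge: Callable[[str, str, str], bool],
) -> Dict[str, float]:
    """Return per-task accuracy, letting `judge` decide correctness."""
    correct: Dict[str, int] = defaultdict(int)
    total: Dict[str, int] = defaultdict(int)
    for s in samples:
        # Model under test answers from the image and the question.
        answer = model(s.image_path, s.question)
        total[s.task] += 1
        # Evaluator compares the answer with the reference.
        if judge(s.question, s.reference, answer):
            correct[s.task] += 1
    return {task: correct[task] / total[task] for task in total}
```

Passing the model and the judge as plain callables keeps the loop independent of any particular API, so a refined prompt or an added image example can be tested by swapping in a different `model` wrapper.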

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
