CV · Sep 26, 2025

Customizing Visual Emotion Evaluation for MLLMs: An Open-vocabulary, Multifaceted, and Scalable Approach

Daiqing Wu, Dongbao Yang, Sicheng Zhao, Can Ma, Yu Zhou

arXiv:2509.21950v1 · 3 citations · h-index: 23 · Has Code

Originality: Incremental advance

AI Analysis

This work gives researchers and developers a scalable, customizable way to evaluate emotional intelligence in MLLMs. It is an incremental advance, since it builds on existing evaluation methods.

The paper tackles the inconsistent results reported for Multimodal Large Language Models (MLLMs) in perceiving emotions from images. It proposes an Emotion Statement Judgment task and an automated pipeline that constructs emotion-centric statements for evaluation. The evaluation shows that MLLMs are stronger at emotion interpretation and context-based emotion judgment, weaker at comprehending perception subjectivity, and still well behind human performance.

Recently, Multimodal Large Language Models (MLLMs) have achieved exceptional performance across diverse tasks, continually surpassing previous expectations regarding their capabilities. Nevertheless, their proficiency in perceiving emotions from images remains debated, with studies yielding divergent results in zero-shot scenarios. We argue that this inconsistency stems partly from constraints in existing evaluation methods, including the oversight of plausible responses, limited emotional taxonomies, neglect of contextual factors, and labor-intensive annotations. To facilitate customized visual emotion evaluation for MLLMs, we propose an Emotion Statement Judgment task that overcomes these constraints. Complementing this task, we devise an automated pipeline that efficiently constructs emotion-centric statements with minimal human effort. Through systematically evaluating prevailing MLLMs, our study showcases their stronger performance in emotion interpretation and context-based emotion judgment, while revealing relative limitations in comprehending perception subjectivity. When compared to humans, even top-performing MLLMs like GPT4o demonstrate remarkable performance gaps, underscoring key areas for future improvement. By developing a fundamental evaluation framework and conducting a comprehensive MLLM assessment, we hope this work contributes to advancing emotional intelligence in MLLMs. Project page: https://github.com/wdqqdw/MVEI.
