CL · Oct 18, 2024

MiCEval: Unveiling Multimodal Chain of Thought's Quality via Image Description and Reasoning Steps

arXiv:2410.14668v4 · 20 citations · h-index: 15 · Has Code · NAACL
Originality: Incremental advance
AI Analysis

MiCEval addresses a gap in evaluating multimodal reasoning that matters to researchers and practitioners, but the advance is incremental because it builds on existing MCoT strategies.

The paper tackles the lack of automated methods for evaluating reasoning step quality in Multimodal Chain of Thought (MCoT) by proposing MiCEval, a framework that assesses correctness through image descriptions and step-wise reasoning, showing it aligns more closely with human judgments than existing methods.

Multimodal Chain of Thought (MCoT) is a popular prompting strategy for improving the performance of multimodal large language models (MLLMs) across a range of complex reasoning tasks. Despite its popularity, there is a notable absence of automated methods for evaluating the quality of reasoning steps in MCoT. To address this gap, we propose Multimodal Chain-of-Thought Evaluation (MiCEval), a framework designed to assess the correctness of reasoning chains by evaluating the quality of both the description and each reasoning step. The evaluation of the description component focuses on the accuracy of the image descriptions, while the reasoning-step component evaluates the quality of each step as it is conditionally generated based on the preceding steps. MiCEval is built upon a fine-grained dataset with annotations that rate each step according to correctness, relevance, and informativeness. Extensive experiments on four state-of-the-art MLLMs show that step-wise evaluations using MiCEval align more closely with human judgments than existing methods based on cosine similarity or fine-tuning approaches. MiCEval datasets and code are available at https://github.com/alenai97/MiCEval.
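
To make the step-wise idea concrete, here is a minimal sketch of how per-step ratings on the three annotated criteria could be stored and combined. It is not MiCEval's actual scoring: the `StepRating` class, the 0-to-1 score range, the mean-per-step rule, and the min-over-steps rule are all assumptions made for illustration. The repository linked above is the place to check the real definitions.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class StepRating:
    """Ratings for one MCoT step (hypothetical structure, scores assumed in [0, 1])."""
    kind: str  # "description" or "reasoning"
    correctness: float
    relevance: float
    informativeness: float

    def score(self) -> float:
        # Assumed step score: plain mean of the three criteria.
        return mean([self.correctness, self.relevance, self.informativeness])


def chain_score(steps: list[StepRating]) -> float:
    """Assumed chain score: the weakest step bounds the whole chain."""
    if not steps:
        raise ValueError("chain has no steps")
    return min(step.score() for step in steps)


# Example: one image-description step followed by one reasoning step.
chain = [
    StepRating("description", correctness=1.0, relevance=1.0, informativeness=0.5),
    StepRating("reasoning", correctness=0.5, relevance=1.0, informativeness=0.0),
]
print(chain_score(chain))
```

Taking the minimum is only one possible design choice: it reflects the intuition that a single faulty step undermines the whole chain, whereas a mean or product would weight the steps differently.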

Code Implementations: 1 repo
