CL · May 20, 2025

Mixed Signals: Understanding Model Disagreement in Multimodal Empathy Detection

arXiv:2505.13979v3 · 2 citations · h-index: 4 · IJCNLP-AACL
Originality: Incremental advance
AI Analysis

This work addresses a specific challenge in multimodal AI for empathy detection, offering incremental insights into model failures and diagnostic methods.

The paper tackles model disagreement in multimodal empathy detection, where conflicting cues across modalities degrade performance. It finds that such disagreements often reflect underlying ambiguity and can serve as a diagnostic signal for improving system robustness.

Multimodal models play a key role in empathy detection, but their performance can suffer when modalities provide conflicting cues. To understand these failures, we examine cases where unimodal and multimodal predictions diverge. Using fine-tuned models for text, audio, and video, along with a gated fusion model, we find that such disagreements often reflect underlying ambiguity, as evidenced by annotator uncertainty. Our analysis shows that dominant signals in one modality can mislead fusion when unsupported by others. We also observe that humans, like models, do not consistently benefit from multimodal input. These insights position disagreement as a useful diagnostic signal for identifying challenging examples and improving empathy system robustness.
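The comparison described in the abstract, checking where unimodal predictions diverge from the fused prediction, can be illustrated with a minimal sketch. It is not the authors' code: the function name `find_disagreements`, the binary label encoding, and the toy predictions are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List


def find_disagreements(
    unimodal: Dict[str, List[int]], fused: List[int]
) -> List[dict]:
    """Flag examples where at least one unimodal label differs from the fused label.

    `unimodal` maps a modality name to its per-example predicted labels;
    `fused` holds the fusion model's labels for the same examples.
    """
    n = len(fused)
    for name, preds in unimodal.items():
        if len(preds) != n:
            raise ValueError(f"{name} has {len(preds)} predictions, expected {n}")

    records = []
    for i in range(n):
        # Modalities whose prediction differs from the fused prediction.
        dissenting = sorted(
            name for name, preds in unimodal.items() if preds[i] != fused[i]
        )
        if dissenting:
            records.append(
                {"index": i, "fused": fused[i], "dissenting": dissenting}
            )
    return records


# Hypothetical binary labels (1 = empathic, 0 = not), invented for this sketch.
unimodal_preds = {
    "text": [1, 0, 1, 0],
    "audio": [1, 0, 0, 0],
    "video": [1, 1, 0, 0],
}
fused_preds = [1, 0, 1, 0]

flagged = find_disagreements(unimodal_preds, fused_preds)
for record in flagged:
    print(record)
```

Under this sketch, the flagged indices are the candidates one might inspect for ambiguity, for example by comparing them against annotator agreement where such data is available.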

Code Implementations: 1 repo
