CV · Dec 12, 2025

Cross-modal Prompting for Balanced Incomplete Multi-modal Emotion Recognition

arXiv:2512.11239v21 citations · h-index: 10
Originality: Incremental advance
AI Analysis

This work addresses the performance gap and modality under-optimization in multi-modal learning for emotion recognition, particularly under missing data conditions, representing an incremental improvement in the domain of affective computing.

The paper tackles incomplete multi-modal emotion recognition (IMER) by proposing a Cross-modal Prompting (ComP) method that enhances modality-specific features to improve overall recognition accuracy. The method is validated through experiments on 4 datasets against 7 state-of-the-art methods under various missing rates.

Incomplete multi-modal emotion recognition (IMER) aims to understand human intentions and sentiments by comprehensively exploring partially observed multi-source data. Although multi-modal data is expected to provide more abundant information, the performance gap and the modality under-optimization problem hinder effective multi-modal learning in practice, and both are exacerbated when data are missing. To address this issue, we devise a novel Cross-modal Prompting (ComP) method, which emphasizes coherent information by enhancing modality-specific features and improves overall recognition accuracy by boosting each modality's performance. Specifically, a progressive prompt generation module with a dynamic gradient modulator is proposed to produce concise and consistent modality semantic cues. Meanwhile, cross-modal knowledge propagation uses the delivered prompts to selectively amplify the consistent information in modality features, making the modality-specific output more discriminative. Additionally, a coordinator is designed to dynamically re-weight the modality outputs, complementing the balance strategy to improve the model's efficacy. Extensive experiments on 4 datasets against 7 SOTA methods under different missing rates validate the effectiveness of our proposed method.
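To make the idea of dynamically re-weighting modality outputs under missing data more concrete, here is a minimal illustrative sketch. It is not the paper's coordinator: the abstract does not specify how weights are computed, so the entropy-based confidence weighting, the function name `fuse_modalities`, and the modality names are all assumptions chosen for illustration. Missing modalities are simply excluded from the fusion.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy (in nats) of a probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def fuse_modalities(logits_by_modality, available):
    """Fuse per-modality class logits, skipping missing modalities.

    Hypothetical weighting rule (an assumption, not taken from the paper):
    each observed modality is weighted by a softmax over its negative
    prediction entropy, so more confident modalities contribute more.
    """
    names = [n for n in logits_by_modality if available.get(n, False)]
    if not names:
        raise ValueError("at least one modality must be observed")

    probs = {n: softmax(logits_by_modality[n]) for n in names}
    # Lower entropy means a more confident modality, hence a larger weight.
    weights = softmax([-entropy(probs[n]) for n in names])

    num_classes = len(probs[names[0]])
    fused = [0.0] * num_classes
    for w, n in zip(weights, names):
        for c in range(num_classes):
            fused[c] += w * probs[n][c]
    return fused, dict(zip(names, weights))


# Example: the video stream is missing for this sample.
logits = {
    "text": [4.0, 0.0, 0.0],
    "audio": [0.1, 0.0, 0.0],
    "video": [0.0, 5.0, 0.0],
}
mask = {"text": True, "audio": True, "video": False}
fused, weights = fuse_modalities(logits, mask)
```

In this example the confident text prediction receives a larger weight than the near-uniform audio prediction, and the missing video stream does not influence the result. A learned coordinator would typically replace the fixed entropy rule with trainable weights.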

