AIFeb 9

Uncertainty-Aware Multimodal Emotion Recognition through Dirichlet Parameterization

arXiv:2602.09121v1
AI Analysis

This work addresses emotion recognition for healthcare and human-computer interaction applications, but it is incremental as it builds on existing methods with a novel fusion mechanism.

The paper tackles multimodal emotion recognition by proposing a lightweight, privacy-preserving framework for edge devices, achieving competitive accuracy on five benchmark datasets while being computationally efficient and robust to ambiguous or missing inputs.

In this work, we present a lightweight and privacy-preserving Multimodal Emotion Recognition (MER) framework designed for deployment on edge devices. To demonstrate framework's versatility, our implementation uses three modalities - speech, text and facial imagery. However, the system is fully modular, and can be extended to support other modalities or tasks. Each modality is processed through a dedicated backbone optimized for inference efficiency: Emotion2Vec for speech, a ResNet-based model for facial expressions, and DistilRoBERTa for text. To reconcile uncertainty across modalities, we introduce a model- and task-agnostic fusion mechanism grounded in Dempster-Shafer theory and Dirichlet evidence. Operating directly on model logits, this approach captures predictive uncertainty without requiring additional training or joint distribution estimation, making it broadly applicable beyond emotion recognition. Validation on five benchmark datasets (eNTERFACE05, MEAD, MELD, RAVDESS and CREMA-D) show that our method achieves competitive accuracy while remaining computationally efficient and robust to ambiguous or missing inputs. Overall, the proposed framework emphasizes modularity, scalability, and real-world feasibility, paving the way toward uncertainty-aware multimodal systems for healthcare, human-computer interaction, and other emotion-informed applications.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes