HC · Mar 31

FIRMED: A Peak-Centered Multimodal Dataset with Fine-Grained Annotation for Emotion Recognition

Hao Tang, Songyun Xie, Xinzhou Xie, Can Liao, Bohan Li, Zhongyu Tian, Dalu Zheng

arXiv:2507.02350 · 2.9 · h-index: 11

Predicted impact: top 72% in HC · last 90 days

Originality: Incremental advance

AI Analysis

This work addresses the need for temporally localized supervision in multimodal affective computing and offers a practical benchmark for emotion recognition researchers. It is an incremental advance, since it refines existing annotation methods rather than replacing them.

The researchers address temporal label noise in video-induced physiological datasets for emotion recognition by introducing FIRMED, a peak-centered multimodal dataset with fine-grained annotations. In benchmark experiments, FIRMED consistently outperformed whole-trial labeling, with an average gain of 3.8 percentage points across eight EEG-based classifiers.

Traditional video-induced physiological datasets usually rely on whole-trial labels, which introduce temporal label noise in dynamic emotion recognition. We present FIRMED, a peak-centered multimodal dataset based on an immediate-recall annotation paradigm, with synchronized EEG, ECG, GSR, PPG, and facial recordings from 35 participants. FIRMED provides event-centered timestamps, emotion labels, and intensity annotations, and its annotation quality is supported by subjective and physiological validation. Benchmark experiments show that FIRMED consistently outperforms whole-trial labeling, yielding an average gain of 3.8 percentage points across eight EEG-based classifiers, with further improvements under multimodal fusion. FIRMED provides a practical benchmark for temporally localized supervision in multimodal affective computing.
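The contrast between whole-trial labeling and peak-centered labeling can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not FIRMED's actual data format or the authors' pipeline: the function names, the 128 Hz sampling rate, the 4-second window, the two-channel synthetic trial, and the peak time of 41.5 s are all hypothetical.

```python
# Illustrative sketch only: names, sampling rate, and window sizes are
# hypothetical and are not taken from the FIRMED release.

def whole_trial_windows(trial, label, fs, win_s):
    """Cut the whole trial into non-overlapping windows that all share one label."""
    step = int(win_s * fs)
    n_samples = len(trial[0])
    windows = []
    for start in range(0, n_samples - step + 1, step):
        windows.append([channel[start:start + step] for channel in trial])
    return windows, [label] * len(windows)


def peak_centered_window(trial, peak_s, fs, half_s):
    """Keep only the samples within +/- half_s seconds of the annotated peak."""
    n_samples = len(trial[0])
    center = int(round(peak_s * fs))
    half = int(half_s * fs)
    start = max(0, center - half)          # clip at the start of the trial
    stop = min(n_samples, center + half)   # clip at the end of the trial
    return [channel[start:stop] for channel in trial], (start, stop)


fs = 128                                        # assumed sampling rate in Hz
trial = [[0.0] * (60 * fs) for _ in range(2)]   # synthetic 60 s, 2-channel trial

# Whole-trial labeling: every window inherits the trial's label, including
# windows recorded long before or after the emotional peak.
windows, labels = whole_trial_windows(trial, label=1, fs=fs, win_s=4.0)

# Peak-centered labeling: only the segment around the annotated timestamp
# is paired with the label.
segment, bounds = peak_centered_window(trial, peak_s=41.5, fs=fs, half_s=2.0)

print(len(windows), len(segment[0]), bounds)
```

In the whole-trial case all fifteen windows carry the same label regardless of when the emotion occurred, which is the source of the temporal label noise the abstract describes. The peak-centered case pairs the label with a single segment around the annotated event.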
