Attending to Emotional Narratives
This work applies established attention methods to a new dataset for affective computing, representing an incremental advancement.
The authors tackle emotion recognition from multimodal time-series data by adapting two existing attention mechanisms (the Transformer and the Memory Fusion Network) to predict emotional valence over time, in some cases achieving performance comparable to that of human raters.
Attention mechanisms in deep neural networks have achieved excellent performance on sequence-prediction tasks. Here, we show that these recently proposed attention-based mechanisms, in particular the Transformer with its parallelizable self-attention layers and the Memory Fusion Network with attention across modalities and time, also generalize well to multimodal time-series emotion recognition. Using a recently introduced dataset of emotional autobiographical narratives, we adapt and apply these two attention mechanisms to predict emotional valence over time. Our models perform extremely well, in some cases reaching performance comparable with that of human raters. We end with a discussion of the implications of attention mechanisms for affective computing.
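To make the self-attention idea concrete, the following is a minimal sketch of scaled dot-product self-attention applied to a short feature sequence, followed by a per-time-step valence readout. It is not the architecture used in this work: the learned query, key, and value projections, multiple heads, and positional encodings of a full Transformer are omitted (identity projections are used instead), and the names `self_attention` and `predict_valence`, the toy features, the untrained readout weights, and the [-1, 1] valence range are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(sequence):
    """Scaled dot-product self-attention with identity projections.

    sequence: list of T feature vectors, each a list of d floats.
    Returns (outputs, weights): T context vectors and the T x T
    attention matrix (each row sums to 1).
    """
    d = len(sequence[0])
    outputs, weights = [], []
    for query in sequence:
        # Similarity of this time step to every time step, scaled by sqrt(d).
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
            for key in sequence
        ]
        attn = softmax(scores)
        # Context vector: attention-weighted average over all time steps.
        context = [
            sum(a * value[j] for a, value in zip(attn, sequence))
            for j in range(d)
        ]
        outputs.append(context)
        weights.append(attn)
    return outputs, weights


def predict_valence(sequence, readout):
    """Map each attended time step to a value in [-1, 1] (linear readout + tanh)."""
    outputs, _ = self_attention(sequence)
    return [
        math.tanh(sum(w * x for w, x in zip(readout, ctx)))
        for ctx in outputs
    ]


# Toy input: 4 time steps of 3-dimensional (hypothetical) multimodal features.
features = [
    [0.1, 0.2, 0.0],
    [0.4, 0.1, 0.3],
    [0.0, 0.5, 0.2],
    [0.3, 0.3, 0.1],
]
readout = [0.5, -0.25, 1.0]  # hypothetical, untrained weights
valence = predict_valence(features, readout)
```

Because every time step attends to every other time step independently, the rows of the attention matrix can be computed in parallel, which is the property the abstract refers to as parallelizable self-attention.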