CV | LG | Mar 19, 2023

Computer Vision Estimation of Emotion Reaction Intensity in the Wild

arXiv:2303.10741v2 | 7 citations | h-index: 49
Originality: Synthesis-oriented
AI Analysis

This work addresses the need for finer-grained emotion prediction in applications such as robotics and healthcare. It is incremental in that it applies existing deep learning methods to a new benchmark.

The paper tackles the problem of estimating fine-grained emotion reaction intensity from visual and audio data. Its best-performing model, which uses a pre-trained ResNet50, achieves an average Pearson correlation coefficient of 0.4080 on the Hume-Reaction test set.

Emotions play an essential role in human communication. Developing computer vision models for automatic recognition of emotion expression can aid a variety of domains, including robotics, digital behavioral healthcare, and media analytics. Three types of emotional representation are traditionally modeled in affective computing research: Action Units, Valence-Arousal (VA), and Categorical Emotions. As part of an effort to move beyond these representations towards more fine-grained labels, we describe our submission to the newly introduced Emotional Reaction Intensity (ERI) Estimation challenge in the 5th competition for Affective Behavior Analysis in-the-Wild (ABAW). We developed four deep neural networks trained in the visual domain and a multimodal model trained with both visual and audio features to predict emotion reaction intensity. Our best-performing model on the Hume-Reaction dataset achieved an average Pearson correlation coefficient of 0.4080 on the test set using a pre-trained ResNet50 model. This work provides a first step towards the development of production-grade models that predict emotion reaction intensities rather than discrete emotion categories.
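
The reported metric can be illustrated with a short sketch. It assumes that the "average Pearson correlation coefficient" is the mean of per-emotion Pearson coefficients computed between predicted and annotated intensities; the abstract does not spell out the averaging, so treat this as an assumption. The function names `pearson` and `mean_pearson` and the toy data are hypothetical and are not taken from the paper.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length sequences of numbers."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # The coefficient is undefined when one sequence is constant.
        raise ValueError("Pearson correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


def mean_pearson(y_true, y_pred):
    """Average the per-column Pearson coefficient.

    Rows are samples; columns are emotion classes (an assumed layout).
    """
    true_cols = list(zip(*y_true))
    pred_cols = list(zip(*y_pred))
    if len(true_cols) != len(pred_cols):
        raise ValueError("y_true and y_pred must have the same number of columns")
    scores = [pearson(t, p) for t, p in zip(true_cols, pred_cols)]
    return sum(scores) / len(scores)


# Toy data: three samples, two hypothetical emotion classes.
y_true = [[0.1, 0.9], [0.5, 0.5], [0.9, 0.1]]
y_pred = [[0.2, 0.1], [0.6, 0.5], [1.0, 0.9]]
score = mean_pearson(y_true, y_pred)
```

In the toy data the first column is perfectly correlated and the second perfectly anti-correlated, so the two coefficients cancel in the average. Because Pearson correlation is invariant to scale and offset, a model can score well under this metric even when its predicted intensities are not calibrated to the annotation range.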

