CV LG Oct 20, 2022

SS-VAERR: Self-Supervised Apparent Emotional Reaction Recognition from Video

Marija Jegorova, Stavros Petridis, Maja Pantic

arXiv:2210.11341v1 3.75 citations h-index: 90

Originality: Synthesis-oriented

AI Analysis

This work addresses video-based emotion recognition for applications such as human-computer interaction. The contribution is incremental, since it analyzes existing pretext tasks and loss combinations.

The paper tackles apparent emotional reaction recognition from video using self-supervised learning and reports state-of-the-art performance with continuous annotations.

This work focuses on apparent emotional reaction recognition (AERR) from video-only input, conducted in a self-supervised fashion. The network is first pre-trained on different self-supervised pretext tasks and later fine-tuned on the downstream target task. Self-supervised learning facilitates the use of pre-trained architectures and of larger datasets that might be deemed unfit for the target task, yet can still help the network learn informative representations and so provide useful initializations for further fine-tuning on smaller, more suitable data. Our contribution is two-fold: (1) an analysis of different state-of-the-art (SOTA) pretext tasks for the video-only apparent emotional reaction recognition architecture, and (2) an analysis of various combinations of regression and classification losses that are likely to improve performance further. Together, these two contributions result in the current state-of-the-art performance for video-only spontaneous apparent emotional reaction recognition with continuous annotations.
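To make the idea of combining regression and classification losses concrete, here is a minimal sketch in plain Python. It is not the authors' formulation: the abstract does not say which losses are combined or how. The sketch assumes a concordance correlation coefficient (CCC) term for regression and a softmax cross-entropy term over discretized labels for classification. The label range [-1, 1], the equal-width binning, the weights `w_reg` and `w_cls`, and all function names are assumptions made for illustration.

```python
import math

def ccc(pred, target):
    """Concordance correlation coefficient of two equal-length sequences."""
    n = len(pred)
    mean_p = sum(pred) / n
    mean_t = sum(target) / n
    var_p = sum((p - mean_p) ** 2 for p in pred) / n
    var_t = sum((t - mean_t) ** 2 for t in target) / n
    cov = sum((p - mean_p) * (t - mean_t) for p, t in zip(pred, target)) / n
    denom = var_p + var_t + (mean_p - mean_t) ** 2
    # Identical constant sequences give denom == 0; return 0.0 by convention here.
    return 2 * cov / denom if denom > 0 else 0.0

def to_bin(value, num_bins, low=-1.0, high=1.0):
    """Map a continuous label in [low, high] to one of num_bins equal-width classes."""
    clipped = min(max(value, low), high)
    idx = int((clipped - low) / (high - low) * num_bins)
    return min(idx, num_bins - 1)  # the upper edge falls into the last bin

def cross_entropy(logits, class_idx):
    """Softmax cross-entropy for one sample, using the log-sum-exp trick."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[class_idx]

def combined_loss(preds, logits_batch, targets, w_reg=1.0, w_cls=1.0):
    """Weighted sum of (1 - CCC) and the mean cross-entropy over binned targets.

    preds:        continuous predictions, one per frame (hypothetical regression head)
    logits_batch: per-frame class logits (hypothetical classification head)
    targets:      continuous ground-truth annotations, assumed to lie in [-1, 1]
    """
    num_bins = len(logits_batch[0])
    reg = 1.0 - ccc(preds, targets)
    cls = sum(
        cross_entropy(logits, to_bin(t, num_bins))
        for logits, t in zip(logits_batch, targets)
    ) / len(targets)
    return w_reg * reg + w_cls * cls
```

Setting `w_cls=0.0` reduces this to a pure regression objective and `w_reg=0.0` to a pure classification objective, so different loss combinations can be compared by sweeping the two weights.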
