CV · Sep 29, 2020

A Multi-term and Multi-task Analyzing Framework for Affective Analysis in-the-wild

Sachihiro Youoku, Yuushi Toyoda, Takahisa Yamamoto, Junya Saito, Ryosuke Kawamura, Xiaoyu Mi, Kentaro Murase

arXiv:2009.13885v27.214 citations

Originality: Synthesis-oriented

AI Analysis

This work addresses the need for more accurate affective recognition in human-computer interaction, but it is incremental as it builds on existing contest submissions and methods.

The paper tackles inaccurate affective recognition from in-the-wild video data by proposing a multi-term, multi-task framework, which achieves a valence-arousal score of 0.498 and a facial expression score of 0.471 on a validation set.

Human affective recognition is an important factor in human-computer interaction. However, methods developed with in-the-wild data are not yet accurate enough for practical use. In this paper, we introduce an affective recognition method focusing on valence-arousal (VA) and expression (EXP) that was submitted to the Affective Behavior Analysis in-the-wild (ABAW) 2020 Contest. Because affective behaviors have many observable features, each with its own time frame, we introduced multiple optimized time windows (short-term, middle-term, and long-term) into our analysis framework for extracting feature parameters from video data. Moreover, data from multiple modalities are used, including action units, head poses, gaze, posture, and ResNet-50 or EfficientNet features, and are optimized during feature extraction. We then generated affective recognition models for each time window and ensembled these models together. We also fused the valence, arousal, and expression models together to enable multi-task learning, considering that the basic psychological states behind facial expressions are closely related to each other. On the validation set, our model achieved a valence-arousal score of 0.498 and a facial expression score of 0.471. These verification results show that our proposed framework can effectively improve estimation accuracy and robustness.
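The multi-term idea in the abstract can be illustrated with a minimal sketch: one per-frame signal is summarized over short, middle, and long trailing windows, and the per-window outputs are then averaged as a simple ensemble. This is not the authors' implementation. The window lengths, the function names (`windowed_mean`, `multi_term_features`, `ensemble`), the use of a trailing mean as the feature, and plain averaging as the ensembling rule are all assumptions made for illustration; the abstract does not specify any of them.

```python
from statistics import mean

# Hypothetical window lengths in frames (assumption: the paper's
# optimized short/middle/long values are not stated in the abstract).
WINDOWS = {"short": 3, "middle": 5, "long": 9}


def windowed_mean(signal, length):
    """Trailing mean over at most the last `length` frames at each step."""
    out = []
    for t in range(len(signal)):
        start = max(0, t - length + 1)
        out.append(mean(signal[start:t + 1]))
    return out


def multi_term_features(signal):
    """Compute one smoothed feature track per time window."""
    return {name: windowed_mean(signal, n) for name, n in WINDOWS.items()}


def ensemble(predictions):
    """Average the per-frame outputs of the per-window tracks.

    Assumption: plain averaging stands in for whatever ensembling
    rule the authors actually used.
    """
    return [mean(frame) for frame in zip(*predictions.values())]


# Toy per-frame signal, e.g. one action-unit intensity (made-up values).
signal = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
features = multi_term_features(signal)

# In a real pipeline each window's features would feed its own model;
# here the smoothed tracks themselves stand in for model predictions.
fused = ensemble(features)
```

In this toy setup the short window follows fast changes in the signal while the long window responds more slowly, which is the intuition behind combining several time scales.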
