CV · AI · Mar 19, 2024

Emotion Recognition Using Transformers with Masked Learning

arXiv:2403.13731v2 · 15 citations
Originality: Incremental advance
AI Analysis

This work addresses emotion and behavior analysis for applications like affective computing, but it is incremental as it builds on existing Transformer models with specific adaptations.

The study tackled emotion recognition by proposing a Transformer-based framework with random frame masking and Focal loss, achieving enhanced accuracy for Valence-Arousal estimation, facial expression recognition, and Action Unit detection in real-world settings.

In recent years, deep learning has driven significant advances in many fields, including the analysis of human emotions and behaviors. Initiatives such as the Affective Behavior Analysis in-the-wild (ABAW) competition have been particularly instrumental in driving research in this area by providing diverse and challenging datasets that enable precise evaluation of complex emotional states. This study uses the Vision Transformer (ViT) and Transformer models for three tasks: estimation of Valence-Arousal (VA), which captures the positivity and intensity of emotions; recognition of facial expressions; and detection of Action Units (AU), which represent fundamental facial muscle movements. The approach moves beyond traditional methods based on Convolutional Neural Networks (CNNs) and Long Short-Term Memory (LSTM) networks, proposing a Transformer-based framework designed to capture both temporal and spatial features. The core contributions are a training technique based on random frame masking and the application of Focal loss to handle imbalanced data, both aimed at improving the accuracy and applicability of emotion and behavior analysis in real-world settings. The approach is expected to contribute to the advancement of affective computing and deep learning methodologies.
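
The two techniques named in the abstract, random frame masking and Focal loss, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not give the masking ratio, how masked frames are filled, or the Focal loss hyperparameters, so `mask_ratio=0.25`, zero-filling, and `gamma=2.0` are assumptions chosen for illustration, and the function names are hypothetical. The loss follows the standard multi-class form FL(p_t) = -alpha_t (1 - p_t)^gamma log(p_t).

```python
import numpy as np


def random_frame_mask(features, mask_ratio=0.25, rng=None):
    """Zero out a random subset of frames in a (T, D) feature sequence.

    Assumption: masked frames are replaced with zeros. A learnable mask
    token would be another reasonable choice; the abstract does not say.
    Returns the masked copy and a boolean mask of shape (T,), True = masked.
    """
    rng = np.random.default_rng() if rng is None else rng
    num_frames = features.shape[0]
    num_masked = int(round(mask_ratio * num_frames))
    masked_idx = rng.choice(num_frames, size=num_masked, replace=False)
    mask = np.zeros(num_frames, dtype=bool)
    mask[masked_idx] = True
    masked = features.copy()
    masked[mask] = 0.0
    return masked, mask


def focal_loss(logits, targets, gamma=2.0, alpha=None):
    """Mean multi-class focal loss over a batch.

    logits:  (N, C) unnormalized scores
    targets: (N,) integer class labels
    gamma:   focusing parameter; gamma=0 reduces to cross-entropy
    alpha:   optional (C,) per-class weights for imbalanced data
    """
    # Numerically stable log-softmax.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    # Log-probability assigned to the true class of each sample.
    log_pt = log_probs[np.arange(len(targets)), targets]
    pt = np.exp(log_pt)
    # Down-weight well-classified samples by (1 - p_t)^gamma.
    loss = -((1.0 - pt) ** gamma) * log_pt
    if alpha is not None:
        loss = np.asarray(alpha)[targets] * loss
    return float(loss.mean())


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    frames = rng.normal(size=(16, 8))        # 16 frames, 8-dim features
    masked, mask = random_frame_mask(frames, mask_ratio=0.25, rng=rng)
    print("masked frames:", int(mask.sum()))

    logits = rng.normal(size=(4, 3))         # 4 samples, 3 classes
    targets = np.array([0, 2, 1, 0])
    print("focal loss:", focal_loss(logits, targets))
```

In a training loop of this kind, the masked sequence would be fed to the temporal Transformer in place of the original frames, and the focal loss would replace plain cross-entropy for the classification tasks (expression recognition and AU detection), where some classes are much rarer than others.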

Code Implementations: 1 repo
