BEE-NET: A deep neural network to identify in-the-wild Bodily Expression of Emotions
This work addresses automatic emotion recognition from body language in uncontrolled environments, a problem relevant to human-computer interaction and psychology. The contribution is incremental: it builds on existing deep learning methods and adds specific enhancements.
The paper tackles the identification of bodily expression of emotions in real-world settings by investigating how environmental factors such as scenes and objects affect body language. It introduces BEE-NET, whose novel fusion strategy improves on the state of the art by 2.07% and reaches an Emotional Recognition Score of 66.33%.
In this study, we investigate how environmental factors, specifically the scenes and objects involved, can affect the expression of emotions through body language. To this end, we introduce a novel multi-stream deep convolutional neural network named BEE-NET. We also propose a new late fusion strategy that incorporates meta-information on places and objects as prior knowledge in the learning process. Our proposed probabilistic pooling model leverages this information to generate a joint probability distribution over both available and anticipated unavailable contextual information in latent space. Importantly, our fusion strategy is differentiable, which allows end-to-end training and captures hidden associations among data points without further post-processing or regularisation. To evaluate our deep model, we use the Body Language Database (BoLD), which is currently the largest available database for the Automatic Identification of the in-the-wild Bodily Expression of Emotions (AIBEE). Our experimental results demonstrate that our proposed approach surpasses the current state-of-the-art in AIBEE by a margin of 2.07%, achieving an Emotional Recognition Score of 66.33%.
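To make the idea of a differentiable late fusion with contextual priors more concrete, the following is a minimal sketch and not the BEE-NET probabilistic pooling model itself, whose details are not given here. It assumes a simplified single-label setting in which each stream outputs a probability vector over the same classes, and it combines them with a context prior by summing log-probabilities and renormalising (a generic product-style fusion). The function names `softmax` and `late_fusion`, the three-class example, and all numeric values are hypothetical and chosen only for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def late_fusion(stream_probs, context_prior, eps=1e-12):
    """Fuse per-stream class probabilities with a context prior.

    Assumption: this is a generic product-style fusion, not the pooling
    model described in the abstract. For each class, the log prior and the
    log-probabilities from every stream are summed, then the scores are
    renormalised with a softmax. Every step is smooth, so the same
    computation could sit inside an end-to-end trained network.
    """
    num_classes = len(context_prior)
    log_scores = []
    for c in range(num_classes):
        score = math.log(context_prior[c] + eps)
        for probs in stream_probs:
            score += math.log(probs[c] + eps)
        log_scores.append(score)
    return softmax(log_scores)


# Hypothetical outputs of two streams (e.g. body and scene) over 3 classes.
body_stream = softmax([2.0, 0.5, 0.1])
scene_stream = softmax([1.0, 1.2, 0.3])

# Hypothetical prior derived from place/object meta-information.
prior = [0.5, 0.3, 0.2]

fused = late_fusion([body_stream, scene_stream], prior)
print(fused)
```

In this sketch the prior acts as one more multiplicative factor, so a context that makes a class unlikely lowers its fused probability even when a single stream is confident. A trained model would learn how much weight each source receives rather than treating them equally as done here.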