An Audio-Video Deep and Transfer Learning Framework for Multimodal Emotion Recognition in the wild
This work addresses emotion recognition for applications in human-computer interaction, but it is incremental as it builds on existing deep learning and transfer learning methods.
The paper tackles multimodal emotion recognition in the wild with an audio-video deep and transfer learning framework, reaching a challenge performance measure of 42.10% on the test set of the ABAW facial expression challenge.
In this paper, we present our contribution to the ABAW facial expression challenge. We report the proposed system and the official challenge results, adhering to the challenge protocol. Using end-to-end deep learning and benefiting from transfer learning approaches, we reached a challenge performance measure of 42.10% on the test set.