Interpretable Deep Neural Networks for Dimensional and Categorical Emotion Recognition in-the-wild
This work addresses emotion recognition for applications such as human-computer interaction and automobile safety, but it is incremental, as it builds on existing CNN+RNN methods with combined representations.
The paper tackles emotion recognition in-the-wild by extending a database and training CNN+RNN models on categorical, valence-arousal, and combined representations. It finds that combined models improve categorical recognition performance, and that the mapping between the two representations explains this effect.
Emotions play an important role in people's lives. Understanding and recognising them is not only important for interpersonal communication, but also has promising applications in human-computer interaction, automobile safety and medical research. This project focuses on extending an existing emotion recognition database and training CNN + RNN emotion recognition networks on two representations: emotion categories, and valence and arousal values. Combined models are constructed by training on the two representations simultaneously. The three types of model are compared and analysed. The relationship between the two emotion representations and the interpretability of the neural networks are also investigated. The findings suggest that categorical emotion recognition benefits from training with a combined model, and that the mapping between emotion categories and valence and arousal values can explain this effect.
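As a rough illustration of what training the two representations simultaneously could look like, the sketch below sums a categorical loss and a valence-arousal regression loss for a single sample. It is not the objective used in this work: the choice of cross-entropy and mean squared error, the function name `combined_loss`, and the weighting parameter `va_weight` are all assumptions made for the example.

```python
import math


def softmax(logits):
    """Convert raw category scores into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def combined_loss(category_logits, true_category, va_pred, va_true, va_weight=1.0):
    """Hypothetical joint objective for one sample.

    category_logits: raw scores, one per emotion category
    true_category:   index of the ground-truth category
    va_pred/va_true: (valence, arousal) pairs
    va_weight:       assumed trade-off between the two terms
    """
    # Categorical term: cross-entropy on the ground-truth category.
    probs = softmax(category_logits)
    cross_entropy = -math.log(probs[true_category])

    # Dimensional term: mean squared error over valence and arousal.
    mse = sum((p - t) ** 2 for p, t in zip(va_pred, va_true)) / len(va_true)

    return cross_entropy + va_weight * mse


# Example: three categories, ground truth is category 0.
loss = combined_loss([2.0, 0.5, -1.0], 0, (0.6, 0.3), (0.8, 0.4))
```

Under this assumed formulation, both terms are computed from the same sample and would backpropagate through the same network, so an error in the valence and arousal predictions also shapes the features used for the categorical prediction. This is one plausible mechanism by which a combined model could help categorical recognition.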