GiMeFive: Towards Interpretable Facial Emotion Classification
This work addresses the need for interpretable facial emotion classification, which is crucial for applications in human-computer interaction and psychology. The contribution appears incremental, since it builds on existing deep learning methods and adds interpretability features.
The paper tackles the problem of unreliable and unexplainable facial emotion recognition by proposing GiMeFive, an interpretable model using layer activations and gradient-weighted class activation mapping, which outperforms state-of-the-art methods with improved accuracy on two benchmarks and a new aggregated dataset.
Deep convolutional neural networks have been used successfully to recognize facial emotions in computer vision for several years. However, existing detection approaches are not always reliable or explainable. We therefore propose GiMeFive, a model with interpretations via layer activations and gradient-weighted class activation mapping (Grad-CAM). We compare against state-of-the-art methods on the classification of six facial emotions. Empirical results show that our model outperforms previous methods in accuracy on two Facial Emotion Recognition (FER) benchmarks and on our aggregated dataset, FER GiMeFive. Furthermore, we demonstrate and explain our approach on real-world image and video examples, as well as on real-time live camera streams. Our code and supplementary material are available at https://github.com/werywjw/SEP-CVDL.
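To make the interpretation step concrete, the following is a minimal sketch of the standard gradient-weighted class activation mapping computation named in the abstract. It is not taken from the GiMeFive repository. The function name `grad_cam` and the nested-list inputs are assumptions for illustration; in practice the activations and gradients would come from the last convolutional layer of a trained network.

```python
def grad_cam(activations, gradients):
    """Compute a normalized Grad-CAM heatmap.

    activations: K feature maps of one conv layer, each an H x W nested list.
    gradients:   gradients of the target class score with respect to those
                 maps, in the same K x H x W layout.
    Both inputs are hypothetical stand-ins for values read from a real model.
    """
    # Channel weight: global average of the gradient over all spatial positions.
    weights = [
        sum(sum(row) for row in grad_map) / (len(grad_map) * len(grad_map[0]))
        for grad_map in gradients
    ]
    height, width = len(activations[0]), len(activations[0][0])
    # Weighted sum of the feature maps, followed by ReLU so that only regions
    # with positive influence on the class score remain.
    cam = [
        [
            max(0.0, sum(w * fmap[i][j] for w, fmap in zip(weights, activations)))
            for j in range(width)
        ]
        for i in range(height)
    ]
    # Scale to [0, 1] for display; an all-zero map is left unchanged.
    peak = max(max(row) for row in cam)
    if peak > 0:
        cam = [[value / peak for value in row] for row in cam]
    return cam


# Toy example with two 2x2 feature maps (values invented for illustration).
activations = [[[1.0, 0.0], [0.0, 2.0]], [[0.0, 3.0], [1.0, 0.0]]]
gradients = [[[1.0, 1.0], [1.0, 1.0]], [[-1.0, -1.0], [-1.0, -1.0]]]
print(grad_cam(activations, gradients))  # [[0.5, 0.0], [0.0, 1.0]]
```

The resulting map is typically upsampled to the input resolution and overlaid on the face image, so that high values mark the regions that most support the predicted emotion.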