CV, Apr 27, 2020

Semantic Neighborhood-Aware Deep Facial Expression Recognition

arXiv:2004.12725v1, 35 citations
AI Analysis

This work addresses the challenge of robust facial expression recognition for applications in human-computer interaction, but it is incremental as it builds on existing FER methods by focusing on output stability.

The paper tackles facial expression recognition (FER) under dataset problems such as imbalance, noise, and lack of data, all of which hinder output consistency. It proposes a method that accounts for neighborhood smoothness to improve stability against semantic perturbations, and it reports state-of-the-art results on AffectNet, getting closer to an upper limit than previous state-of-the-art methods by a factor of 30%.

Unlike many other attributes, facial expression can change continuously, so a slight semantic change in the input should produce only a small fluctuation in the output. This consistency is important. However, current Facial Expression Recognition (FER) datasets may suffer from extreme imbalance, a lack of data, and excessive noise, which hinders this consistency and degrades test performance. In this paper, we consider not only the prediction accuracy on sample points but also their neighborhood smoothness, focusing on the stability of the output with respect to slight semantic perturbations of the input. A novel method is proposed to formulate semantic perturbation and to select unreliable samples during training, reducing their negative effect. Experiments show the effectiveness of the proposed method, and state-of-the-art results are reported: the method gets closer to an upper limit than previous state-of-the-art methods by a factor of 30% on AffectNet, the largest in-the-wild FER database to date.
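The idea of output stability under slight input perturbations can be illustrated with a minimal sketch. This is not the paper's method: the abstract does not say how semantic perturbation is formulated, so the sketch assumes plain Gaussian noise in a feature space as a stand-in. The classifier `linear_model`, the score `neighborhood_instability`, the helper `flag_unreliable`, and the parameters `sigma`, `n_neighbors`, and `threshold` are all hypothetical names introduced for illustration.

```python
import math
import random


def softmax(logits):
    # Numerically stable softmax over a list of logits.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    # KL(p || q) between two discrete distributions.
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def linear_model(weights, x):
    # Hypothetical stand-in classifier: one weight row per expression class.
    return softmax([sum(w * xi for w, xi in zip(row, x)) for row in weights])


def neighborhood_instability(weights, x, sigma=0.05, n_neighbors=8, rng=None):
    # Average divergence between the prediction at x and predictions at
    # slightly perturbed copies of x. Gaussian noise is an assumption here,
    # not the semantic perturbation used in the paper.
    rng = rng or random.Random(0)
    p = linear_model(weights, x)
    total = 0.0
    for _ in range(n_neighbors):
        x_near = [xi + rng.gauss(0.0, sigma) for xi in x]
        total += kl_divergence(p, linear_model(weights, x_near))
    return total / n_neighbors


def flag_unreliable(weights, samples, threshold, **kwargs):
    # Mark samples whose neighborhood is unstable under this illustrative score.
    return [
        neighborhood_instability(weights, x, **kwargs) > threshold
        for x in samples
    ]


# A sharp decision boundary reacts strongly to small perturbations;
# a smooth one barely moves.
sharp = [[50.0, 0.0], [0.0, 50.0]]
smooth = [[1.0, 0.0], [0.0, 1.0]]
x = [0.0, 0.0]
print(neighborhood_instability(sharp, x), neighborhood_instability(smooth, x))
```

In a training loop, a score of this kind could be used to down-weight or skip flagged samples; how the paper actually selects unreliable samples is not specified in the abstract.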
