CV HC | Apr 7, 2023

Masked Student Dataset of Expressions

arXiv:2304.03867v1 | 1.52 citations | h-index: 9 | Has Code

Originality: Synthesis-oriented

AI Analysis

This work addresses a domain-specific problem in facial expression recognition: mask occlusion, which became especially relevant during the Covid-19 pandemic. It contributes a real-world dataset and training methods for handling masked faces, though the contribution is incremental because it builds on existing techniques.

The authors tackled the problem of facial expression recognition (FER) under real-world face mask occlusion by introducing a novel dataset, MSD-E, containing 1,960 real-world masked and non-masked images from 142 individuals. They found that using contrastive learning and knowledge distillation improved model performance in masked scenarios while maintaining non-masked accuracy.

Facial expression recognition (FER) algorithms work well in constrained environments with little or no occlusion of the face. However, real-world face occlusion is prevalent, most notably with the need to use a face mask in the current Covid-19 scenario. While there are works on the problem of occlusion in FER, little has been done before on the particular face mask scenario. Moreover, the few works in this area largely use synthetically created masked FER datasets. Motivated by these challenges posed by the pandemic to FER, we present a novel dataset, the Masked Student Dataset of Expressions or MSD-E, consisting of 1,960 real-world non-masked and masked facial expression images collected from 142 individuals. Along with the issue of obfuscated facial features, we illustrate how other subtler issues in masked FER are represented in our dataset. We then provide baseline results using ResNet-18, finding that its performance dips in the non-masked case when trained for FER in the presence of masks. To tackle this, we test two training paradigms: contrastive learning and knowledge distillation, and find that they increase the model's performance in the masked scenario while maintaining its non-masked performance. We further visualise our results using t-SNE plots and Grad-CAM, demonstrating that these paradigms capitalise on the limited features available in the masked scenario. Finally, we benchmark SOTA methods on MSD-E.
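The abstract names knowledge distillation as one of the two training paradigms but does not spell out the loss. As a rough illustration only, the sketch below shows the generic soft-target distillation objective (hard-label cross-entropy plus a temperature-scaled KL term between teacher and student). It is not taken from the paper: the function names, the `temperature` and `alpha` values, and the example logits are all assumptions made for this sketch, and the authors' actual formulation may differ.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits, softened by `temperature`."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=4.0, alpha=0.5):
    """Generic soft-target distillation loss for one sample.

    `temperature` and `alpha` are illustrative defaults, not values from the paper.
    """
    # Hard-label term: cross-entropy of the student at temperature 1.
    p_student = softmax(student_logits)
    cross_entropy = -math.log(p_student[label])

    # Soft-label term: KL(teacher || student) on temperature-softened outputs.
    p_teacher_soft = softmax(teacher_logits, temperature)
    p_student_soft = softmax(student_logits, temperature)
    kl = sum(t * math.log(t / s)
             for t, s in zip(p_teacher_soft, p_student_soft) if t > 0)

    # The T^2 factor keeps the soft-term gradients on a comparable scale.
    return alpha * cross_entropy + (1 - alpha) * temperature ** 2 * kl


# Hypothetical logits for a 3-class toy problem.
teacher = [2.0, 0.5, -1.0]
student = [1.0, 1.0, -0.5]
loss = distillation_loss(student, teacher, label=0)
```

In a masked-FER setting one could imagine the teacher seeing non-masked faces and the student seeing masked ones, but that pairing is an assumption here rather than something the abstract states.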
