CV · Jan 18, 2022

When Facial Expression Recognition Meets Few-Shot Learning: A Joint and Alternate Learning Framework

arXiv:2201.06781v1 · 22 citations
Originality: Incremental advance
AI Analysis

This work addresses the problem of recognizing diverse human emotions in practical scenarios for applications like human-computer interaction, but it is incremental as it builds on existing few-shot learning and FER techniques.

The paper tackles compound facial expression recognition (FER) in a cross-domain few-shot learning setting, where a model trained on basic expression datasets must identify unseen compound expressions. It proposes an Emotion Guided Similarity Network (EGS-Net) and reports better performance than several state-of-the-art methods on both in-the-lab and in-the-wild datasets.

Human emotions involve basic and compound facial expressions. However, current research on facial expression recognition (FER) mainly focuses on basic expressions, and thus fails to address the diversity of human emotions in practical scenarios. Meanwhile, existing work on compound FER relies heavily on abundant labeled compound expression training data, which are often laboriously collected under the professional instruction of psychology. In this paper, we study compound FER in the cross-domain few-shot learning setting, where only a few images of novel classes from the target domain are required as a reference. In particular, we aim to identify unseen compound expressions with the model trained on easily accessible basic expression datasets. To alleviate the problem of limited base classes in our FER task, we propose a novel Emotion Guided Similarity Network (EGS-Net), consisting of an emotion branch and a similarity branch, based on a two-stage learning framework. Specifically, in the first stage, the similarity branch is jointly trained with the emotion branch in a multi-task fashion. With the regularization of the emotion branch, we prevent the similarity branch from overfitting to sampled base classes that are highly overlapped across different episodes. In the second stage, the emotion branch and the similarity branch play a "two-student game" to alternately learn from each other, thereby further improving the inference ability of the similarity branch on unseen compound expressions. Experimental results on both in-the-lab and in-the-wild compound expression datasets demonstrate the superiority of our proposed method against several state-of-the-art methods.
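The abstract's remark that sampled base classes are "highly overlapped across different episodes" can be made concrete with a small sketch. The code below is illustrative only and is not the authors' implementation: the seven class labels, the 5-way 1-shot setting, the placeholder image ids, and the helper names `sample_episode` and `min_class_overlap` are all assumptions chosen for the example. It samples episodes from a small pool of base classes and measures how many classes any two episodes share.

```python
import itertools
import random

# Hypothetical base-class list (an assumption, not taken from the paper).
BASE_CLASSES = ["anger", "disgust", "fear", "happiness",
                "sadness", "surprise", "neutral"]


def sample_episode(pool, n_way, k_shot, n_query, rng):
    """Sample one N-way K-shot episode from a {class: [items]} pool."""
    classes = rng.sample(sorted(pool), n_way)
    support, query = {}, {}
    for c in classes:
        # Draw support and query items without replacement.
        items = rng.sample(pool[c], k_shot + n_query)
        support[c] = items[:k_shot]
        query[c] = items[k_shot:]
    return support, query


def min_class_overlap(episodes):
    """Smallest number of shared classes between any two episodes."""
    return min(len(set(a) & set(b))
               for a, b in itertools.combinations(episodes, 2))


rng = random.Random(0)
# Placeholder image ids stand in for real face images.
pool = {c: [f"{c}_{i}" for i in range(20)] for c in BASE_CLASSES}
episodes = [sample_episode(pool, n_way=5, k_shot=1, n_query=5, rng=rng)[0]
            for _ in range(50)]
overlap = min_class_overlap(episodes)
print("minimum shared classes between two episodes:", overlap)
```

Under these assumed numbers, any two 5-way episodes drawn from 7 classes must share at least 5 + 5 - 7 = 3 classes, so the similarity branch keeps seeing nearly the same classification task. This is the overfitting risk that the abstract says the emotion branch is meant to regularize.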

