CV, LG · Aug 15, 2021

Self-supervised Contrastive Learning of Multi-view Facial Expressions

arXiv:2108.06723v1 · 44 citations
Originality: Incremental advance
AI Analysis

This work addresses a domain-specific problem for human-computer interaction systems: improving facial expression recognition from challenging angles. It is an incremental advance, since it builds on existing contrastive learning methods.

The paper tackles the performance drop of facial expression recognition on non-frontal images by proposing CL-MEx, a self-supervised contrastive learning framework for multi-view facial expressions, and reports state-of-the-art results on the KDEF and DDCF datasets.

Facial expression recognition (FER) has emerged as an important component of human-computer interaction systems. Despite recent advancements in FER, performance often drops significantly for non-frontal facial images. We propose Contrastive Learning of Multi-view facial Expressions (CL-MEx) to exploit facial images captured simultaneously from different angles towards FER. CL-MEx is a two-step training framework. In the first step, an encoder network is pre-trained with the proposed self-supervised contrastive loss, where it learns to generate view-invariant embeddings for different views of a subject. The model is then fine-tuned with labeled data in a supervised setting. We demonstrate the performance of the proposed method on two multi-view FER datasets, KDEF and DDCF, where state-of-the-art performances are achieved. Further experiments show the robustness of our method in dealing with challenging angles and reduced amounts of labeled data.
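The abstract does not give the form of the contrastive loss, so the following is only a minimal sketch of the general idea of the pre-training step, assuming a generic InfoNCE-style objective in which embeddings of the same capture seen from different angles are treated as positives and all other embeddings in the batch as negatives. The function name `multiview_contrastive_loss`, the `group_ids` argument, and the `temperature` default are hypothetical and are not taken from the paper.

```python
import numpy as np


def multiview_contrastive_loss(embeddings, group_ids, temperature=0.1):
    """Generic multi-view contrastive loss (an assumption, not the paper's exact loss).

    embeddings: (n, d) array, one row per image.
    group_ids:  length-n sequence; rows sharing an id are views of the same capture.
    """
    z = np.asarray(embeddings, dtype=np.float64)
    g = np.asarray(group_ids)
    n = z.shape[0]

    # L2-normalise so the dot product is cosine similarity.
    z = z / np.linalg.norm(z, axis=1, keepdims=True)
    sim = (z @ z.T) / temperature

    # An anchor is never compared with itself.
    self_mask = np.eye(n, dtype=bool)
    sim = np.where(self_mask, -np.inf, sim)

    # Numerically stable log-softmax over all other samples in the batch.
    row_max = sim.max(axis=1, keepdims=True)
    log_denom = np.log(np.exp(sim - row_max).sum(axis=1, keepdims=True))
    log_prob = sim - row_max - log_denom

    # Positives: other views that share the anchor's group id.
    pos_mask = (g[:, None] == g[None, :]) & ~self_mask
    pos_count = pos_mask.sum(axis=1)
    valid = pos_count > 0
    if not valid.any():
        raise ValueError("each batch needs at least two views of some capture")

    # Average log-probability of the positives, per anchor that has any.
    pos_log_prob = np.where(pos_mask, log_prob, 0.0).sum(axis=1)
    loss_per_anchor = -pos_log_prob[valid] / pos_count[valid]
    return float(loss_per_anchor.mean())
```

Under these assumptions, the loss is lower when views of the same capture map to nearby embeddings than when they are scattered, which is the view-invariance property the abstract describes. In the two-step setup, an encoder trained with such an objective would then be fine-tuned with expression labels in the usual supervised way.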

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
