CVLGMLOct 17, 2018

Coherence Constraints in Facial Expression Recognition

arXiv:1810.10326v12 citations
Originality Incremental advance
AI Analysis

This work addresses the problem of facial expression recognition for computer vision applications, but it is incremental as it builds on existing methods with coherence constraints.

The paper tackled facial expression recognition by applying CNNs to static images and video sequences, focusing on the importance of different face parts and using semi-supervised learning with coherence constraints, resulting in improved recognizer quality as shown in experimental analysis.

Recognizing facial expressions from static images or video sequences is a widely studied but still challenging problem. The recent progresses obtained by deep neural architectures, or by ensembles of heterogeneous models, have shown that integrating multiple input representations leads to state-of-the-art results. In particular, the appearance and the shape of the input face, or the representations of some face parts, are commonly used to boost the quality of the recognizer. This paper investigates the application of Convolutional Neural Networks (CNNs) with the aim of building a versatile recognizer of expressions in static images that can be further applied to video sequences. We first study the importance of different face parts in the recognition task, focussing on appearance and shape-related features. Then we cast the learning problem in the Semi-Supervised setting, exploiting video data, where only a few frames are supervised. The unsupervised portion of the training data is used to enforce three types of coherence, namely temporal coherence, coherence among the predictions on the face parts and coherence between appearance and shape-based representation. Our experimental analysis shows that coherence constraints can improve the quality of the expression recognizer, thus offering a suitable basis to profitably exploit unsupervised video sequences. Finally we present some examples with occlusions where the shape-based predictor performs better than the appearance one.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes