CV · May 14, 2021

Meta Auxiliary Learning for Facial Action Unit Detection

arXiv:2105.06620v1 · 21 citations
Originality: Highly original
AI Analysis

Facial action unit (AU) detection matters for affective computing and human-computer interaction, but accurately annotated AU data is scarce. This work addresses that scarcity by using easier-to-annotate facial expression data within a multi-task framework.

The paper tackles facial action unit (AU) detection by addressing the negative transfer that arises when AU detection is learned jointly with facial expression recognition. It proposes a Meta Auxiliary Learning method that automatically weights auxiliary samples, and it reports consistent gains over state-of-the-art multi-task and auxiliary learning methods on popular AU datasets.

Despite the success of deep neural networks on facial action unit (AU) detection, better performance depends on a large number of training images with accurate AU annotations. However, labeling AUs is time-consuming, expensive, and error-prone. Since AU detection and facial expression recognition (FER) are two highly correlated tasks, and facial expressions (FE) are relatively easy to annotate, we consider learning AU detection and FER in a multi-task manner. However, the performance of the AU detection task cannot always be enhanced, because of negative transfer in the multi-task scenario. To alleviate this issue, we propose a Meta Auxiliary Learning method (MAL) that automatically selects highly related FE samples by learning adaptive weights for the training FE samples in a meta-learning manner. The learned sample weights alleviate negative transfer in two ways: 1) they balance the loss of each task automatically, and 2) they suppress the weights of FE samples that have large uncertainties. Experimental results on several popular AU datasets demonstrate that MAL consistently improves AU detection performance compared with state-of-the-art multi-task and auxiliary learning methods. MAL automatically estimates adaptive weights for the auxiliary FE samples according to their semantic relevance to the primary AU detection task.
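
The idea of weighting auxiliary samples by their usefulness to the primary task can be illustrated with a small sketch. This is not the paper's implementation: the abstract does not specify how MAL computes its weights, so the code below swaps in a simple gradient-alignment heuristic, in which an auxiliary sample receives a positive weight only if a gradient step on it would also reduce the primary loss. The function names (`logistic_grad`, `mean_grad`, `auxiliary_weights`), the single shared linear model, the binary auxiliary labels, and the toy data are all assumptions made for illustration.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def logistic_grad(theta, x, y):
    """Gradient of the binary cross-entropy loss for one sample (linear model)."""
    p = 1.0 / (1.0 + math.exp(-dot(theta, x)))
    return [(p - y) * xi for xi in x]

def mean_grad(theta, samples):
    """Average per-sample gradient over a batch of (x, y) pairs."""
    grads = [logistic_grad(theta, x, y) for x, y in samples]
    return [sum(g[j] for g in grads) / len(grads) for j in range(len(theta))]

def auxiliary_weights(theta, primary_batch, aux_batch):
    """Hypothetical weighting rule (an assumption, not MAL itself).

    Each auxiliary sample is scored by the dot product between its own
    gradient and the mean primary-task gradient. Negative scores, which
    signal conflict with the primary task, are clipped to zero, and the
    remaining scores are normalized to sum to one.
    """
    g_primary = mean_grad(theta, primary_batch)
    raw = [max(0.0, dot(logistic_grad(theta, x, y), g_primary))
           for x, y in aux_batch]
    total = sum(raw)
    return [r / total for r in raw] if total > 0 else raw

# Toy data: shared parameters start at zero, so every prediction is 0.5.
theta = [0.0, 0.0]
primary_batch = [([1.0, 0.0], 1), ([1.0, 0.5], 1)]  # stands in for AU samples
aux_batch = [([1.0, 0.0], 1),   # agrees with the primary task
             ([1.0, 0.0], 0)]   # conflicts with the primary task
weights = auxiliary_weights(theta, primary_batch, aux_batch)
print(weights)  # [1.0, 0.0]
```

In this toy case the conflicting auxiliary sample is suppressed entirely, which mirrors the abstract's goal of down-weighting FE samples that would cause negative transfer. A learned weighting scheme such as the one the paper describes would replace the fixed clipping rule with weights optimized in an outer meta-learning loop.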
