CV · Mar 10, 2023

Self-supervised Facial Action Unit Detection with Region and Relation Learning

arXiv:2303.05708v1 · 11 citations · h-index: 5
Originality: Incremental advance
AI Analysis

This work improves AU detection for applications like emotion analysis by incrementally advancing self-supervised learning with better region and relation modeling.

The paper tackles the problem of facial action unit (AU) detection by addressing the scarcity of manual annotations with a self-supervised framework that enhances local features and exploits AU correlations, achieving results comparable or superior to state-of-the-art methods on BP4D and DISFA datasets.

Facial action unit (AU) detection is a challenging task due to the scarcity of manual annotations. Recent works on AU detection with self-supervised learning have emerged to address this problem, aiming to learn meaningful AU representations from large amounts of unlabeled data. However, most existing self-supervised AU detection methods use only global facial features, leaving AU-related properties such as locality and relevance underexplored. In this paper, we propose a novel self-supervised framework for AU detection with region and relation learning. In particular, an AU-related attention map guides the model to focus on AU-specific regions, enhancing the integrity of AU local features. Meanwhile, an improved Optimal Transport (OT) algorithm is introduced to exploit the correlations among AUs. In addition, a Swin Transformer is used to model the long-distance dependencies within each AU region during feature learning. Evaluation results on BP4D and DISFA demonstrate that the proposed method is comparable or even superior to state-of-the-art self-supervised learning methods and supervised AU detection methods.
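
The abstract names an improved Optimal Transport (OT) algorithm but does not describe it. For orientation only, the sketch below shows the standard entropy-regularized OT solver (Sinkhorn iterations), which is not the paper's improved variant. The function name `sinkhorn`, its parameters, and the toy cost matrix are all assumptions made for illustration and do not come from the paper.

```python
import math

def sinkhorn(cost, row_marginal, col_marginal, epsilon=0.1, n_iters=200):
    """Generic entropy-regularized OT (Sinkhorn); NOT the paper's improved OT.

    cost: n x m list of transport costs (hypothetical values).
    row_marginal / col_marginal: target row and column sums of the plan.
    """
    n, m = len(cost), len(cost[0])
    # Gibbs kernel: K[i][j] = exp(-cost[i][j] / epsilon)
    kernel = [[math.exp(-cost[i][j] / epsilon) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iters):
        # Rescale so the plan's row sums match row_marginal
        u = [row_marginal[i] / sum(kernel[i][j] * v[j] for j in range(m))
             for i in range(n)]
        # Rescale so the plan's column sums match col_marginal
        v = [col_marginal[j] / sum(kernel[i][j] * u[i] for i in range(n))
             for j in range(m)]
    # Transport plan: P[i][j] = u[i] * K[i][j] * v[j]
    return [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]

# Toy example (made-up costs): matching on the diagonal is cheap.
toy_cost = [[0.0, 1.0], [1.0, 0.0]]
plan = sinkhorn(toy_cost, [0.5, 0.5], [0.5, 0.5])
```

In this toy case the returned plan puts most of its mass on the low-cost diagonal entries while keeping the requested row and column sums. How the paper builds its cost matrix from AU features, and what its improvement changes, is not stated in the abstract.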

