CV
Feb 1, 2022

Multi-Order Networks for Action Unit Detection

arXiv:2202.00446v2
16 citations
AI Analysis

This work addresses a specific bottleneck in multi-task learning for facial expression analysis, offering an incremental but impactful solution for affective computing applications.

The paper tackles the arbitrary task ordering that chain-rule-based multi-task learning imposes on Action Unit detection, a choice known to cause performance variations. It introduces Multi-Order Network (MONET), which optimizes the task order jointly with the task-wise modules, and reports state-of-the-art performance in AU detection.

Action Units (AUs) are muscular activations used to describe facial expressions. Accurate AU recognition therefore unlocks unbiased face representations, which can improve face-based affective computing applications. From a learning standpoint, AU detection is a multi-task problem with strong inter-task dependencies. To solve such problems, most approaches either rely on weight sharing or add explicit dependency modelling by decomposing the joint task distribution using the Bayes chain rule. While the latter strategy yields comprehensive modelling of inter-task relationships, it requires imposing an arbitrary order on an unordered task set. Crucially, this ordering choice has been identified as a source of performance variations. In this paper, we present Multi-Order Network (MONET), a multi-task method with joint task order optimization. MONET uses a differentiable order selection to jointly learn task-wise modules with their optimal chaining order. Furthermore, we introduce warmup and order dropout to enhance order selection by encouraging order exploration. Experimentally, we first demonstrate MONET's capacity to retrieve the optimal order in a toy environment. Second, we validate the MONET architecture by showing that it outperforms existing multi-task baselines on multiple attribute detection problems chosen for their wide range of dependency settings. More importantly, we demonstrate that MONET significantly extends state-of-the-art performance in AU detection.
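To make the ordering issue concrete, the sketch below factorizes a joint distribution over three binary tasks with the chain rule under a chosen order, then combines several candidate orders with softmax weights. It is a minimal illustration under stated assumptions, not the paper's implementation: the logistic conditionals, the `bias` and `coupling` parameters, and every function name are hypothetical. Order dropout is interpreted here as randomly masking candidate orders before the softmax, and the weighted mixture of per-order likelihoods is one plausible reading of "differentiable order selection".

```python
import itertools
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def chain_log_prob(y, order, bias, coupling):
    """Log p(y) under the chain rule for one task order.

    Each task t is conditioned only on tasks that precede it in `order`
    (hypothetical logistic conditionals, chosen for illustration).
    """
    log_p = 0.0
    seen = []
    for t in order:
        logit = bias[t] + sum(coupling[t][s] * y[s] for s in seen)
        p = sigmoid(logit)
        log_p += math.log(p if y[t] == 1 else 1.0 - p)
        seen.append(t)
    return log_p


def order_weights(logits, drop_prob=0.0, rng=None):
    """Softmax over order logits, optionally masking random orders.

    Assumption: "order dropout" is modelled as dropping each candidate
    order with probability `drop_prob`, keeping at least one.
    """
    keep = [True] * len(logits)
    if rng is not None and drop_prob > 0.0:
        keep = [rng.random() >= drop_prob for _ in logits]
        if not any(keep):
            keep[rng.randrange(len(logits))] = True
    top = max(l for l, k in zip(logits, keep) if k)
    exps = [math.exp(l - top) if k else 0.0 for l, k in zip(logits, keep)]
    total = sum(exps)
    return [e / total for e in exps]


def mixture_log_prob(y, orders, weights, bias, coupling):
    """Log-likelihood of y under the weighted mixture of order-specific chains."""
    return math.log(
        sum(w * math.exp(chain_log_prob(y, o, bias, coupling))
            for o, w in zip(orders, weights))
    )


# Usage: three tasks, all six possible orders, illustrative parameters.
orders = list(itertools.permutations(range(3)))
bias = [0.2, -0.5, 0.1]
coupling = [[0.0, 1.0, -0.5], [0.3, 0.0, 0.8], [-1.2, 0.4, 0.0]]
weights = order_weights([0.0] * len(orders), drop_prob=0.5, rng=random.Random(0))
score = mixture_log_prob((1, 0, 1), orders, weights, bias, coupling)
```

With the same parameters, different orders generally assign different probabilities to the same label vector, which is the source of order-dependent performance the abstract describes. In a trainable version the order logits would be learned by gradient descent alongside the task modules.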

