CV, LG, Apr 14, 2020

Deep Entwined Learning Head Pose and Face Alignment Inside an Attentional Cascade with Doubly-Conditional fusion

arXiv:2004.06558v1
AI Analysis

This work addresses a preprocessing bottleneck for face analysis applications by integrating two closely related tasks, though the contribution is incremental because it builds on existing attentional cascade methods.

The paper tackles head pose estimation and face alignment jointly by entwining them in an attentional cascade with a doubly-conditional fusion scheme, improving on state-of-the-art accuracy on multiple databases for both tasks.

Head pose estimation and face alignment constitute a backbone preprocessing for many applications relying on face analysis. While both are closely related tasks, they are generally addressed separately, e.g. by deducing the head pose from the landmark locations. In this paper, we propose to entwine face alignment and head pose tasks inside an attentional cascade. This cascade uses a geometry transfer network for integrating heterogeneous annotations to enhance landmark localization accuracy. Furthermore, we propose a doubly-conditional fusion scheme to select relevant feature maps, and regions thereof, based on a current head pose and landmark localization estimate. We empirically show the benefit of entwining head pose and landmark localization objectives inside our architecture, and that the proposed AC-DC model enhances the state-of-the-art accuracy on multiple databases for both face alignment and head pose estimation tasks.
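
To make the idea of doubly-conditional fusion more concrete, here is a minimal NumPy sketch. It is not the authors' implementation: the abstract only says that feature maps, and regions of them, are selected based on the current head pose and landmark estimates. The sketch assumes a softmax gate over channels driven by the pose, and a Gaussian spatial mask centred on the landmarks. The function names, the `gate_weights` matrix, and the `sigma` parameter are all hypothetical.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - np.max(x))
    return e / e.sum()


def landmark_attention(landmarks, height, width, sigma=2.0):
    """Spatial mask: pixel-wise max over Gaussians centred on each (x, y) landmark.

    Assumption: a Gaussian bump per landmark stands in for "regions thereof".
    """
    ys, xs = np.mgrid[0:height, 0:width]
    maps = [
        np.exp(-((xs - x) ** 2 + (ys - y) ** 2) / (2.0 * sigma ** 2))
        for x, y in landmarks
    ]
    return np.max(maps, axis=0)


def doubly_conditional_fusion(features, pose, landmarks, gate_weights):
    """Fuse K feature maps under two conditions.

    features:     (K, H, W) stack of feature maps
    pose:         (3,) current head pose estimate, e.g. yaw, pitch, roll
    landmarks:    list of (x, y) current landmark estimates in pixel units
    gate_weights: (K, 3) hypothetical linear map from pose to channel scores
    """
    _, h, w = features.shape
    # Condition 1 (pose): choose which feature maps matter.
    channel_gate = softmax(gate_weights @ pose)          # shape (K,)
    # Condition 2 (landmarks): choose which regions of those maps matter.
    spatial_mask = landmark_attention(landmarks, h, w)   # shape (H, W)
    # Weighted sum over channels, then spatial masking.
    return np.tensordot(channel_gate, features, axes=1) * spatial_mask
```

In a cascade, a function like this would be called at every stage with the refined pose and landmark estimates from the previous stage, so both the channel gate and the spatial mask sharpen as the estimates improve. In a trained model the gate would be learned rather than a fixed linear map.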

