CV, Aug 23, 2022

Towards Accurate Facial Landmark Detection via Cascaded Transformers

arXiv:2208.10808v1, 57 citations, h-index: 16
Originality: Incremental advance
AI Analysis

This work addresses accurate facial landmark detection, a prerequisite for many face-related tasks, and represents an incremental improvement over existing methods.

The paper proposes a cascaded transformer model that formulates facial landmark detection as coordinate regression. It reports new state-of-the-art performance on several standard benchmarks and good generalization in cross-dataset evaluation.

Accurate facial landmarks are essential prerequisites for many tasks related to human faces. In this paper, an accurate facial landmark detector is proposed based on cascaded transformers. We formulate facial landmark detection as a coordinate regression task so that the model can be trained end-to-end. With self-attention in transformers, our model can inherently exploit the structured relationships between landmarks, which benefits landmark detection under challenging conditions such as large pose and occlusion. During cascaded refinement, our model uses a deformable attention mechanism to extract the most relevant image features around the target landmark for coordinate prediction, which yields more accurate alignment. In addition, we propose a novel decoder that refines image features and landmark positions simultaneously. With only a small increase in parameters, the detection performance improves further. Our model achieves new state-of-the-art performance on several standard facial landmark detection benchmarks and shows good generalization ability in cross-dataset evaluation.
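
The cascaded coordinate-regression idea described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes a single feature map, normalized (x, y) coordinates in [0, 1], and random untrained weights. Bilinear sampling at one point per landmark stands in for deformable attention, which in practice samples several learned offsets around each reference point. All function names, the 0.1 offset bound, and the tensor shapes are hypothetical choices made for illustration.

```python
import numpy as np


def bilinear_sample(feat, pts):
    """Sample an (H, W, C) feature map at normalized (x, y) points in [0, 1]."""
    h, w, _ = feat.shape
    x = np.clip(pts[:, 0], 0.0, 1.0) * (w - 1)
    y = np.clip(pts[:, 1], 0.0, 1.0) * (h - 1)
    x0 = np.floor(x).astype(int)
    y0 = np.floor(y).astype(int)
    x1 = np.minimum(x0 + 1, w - 1)
    y1 = np.minimum(y0 + 1, h - 1)
    wx = (x - x0)[:, None]
    wy = (y - y0)[:, None]
    top = feat[y0, x0] * (1 - wx) + feat[y0, x1] * wx
    bottom = feat[y1, x0] * (1 - wx) + feat[y1, x1] * wx
    return top * (1 - wy) + bottom * wy  # shape (N, C)


def self_attention(queries):
    """Single-head self-attention across landmark queries (no learned projections)."""
    scores = queries @ queries.T / np.sqrt(queries.shape[1])
    scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=1, keepdims=True)
    return weights @ queries


def cascaded_refine(feat, init_pts, stage_weights, max_step=0.1):
    """Refine landmark coordinates stage by stage.

    Each stage (assumed design): sample features at the current landmark
    positions, mix them across landmarks with self-attention, and regress
    a bounded (dx, dy) offset that is added to the current coordinates.
    """
    pts = init_pts.copy()
    history = [pts.copy()]
    for w_offset in stage_weights:
        local = bilinear_sample(feat, pts)          # features at current landmarks
        context = local + self_attention(local)     # residual landmark-to-landmark mixing
        delta = np.tanh(context @ w_offset) * max_step
        pts = np.clip(pts + delta, 0.0, 1.0)
        history.append(pts.copy())
    return pts, history


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feature_map = rng.normal(size=(16, 16, 8))      # hypothetical backbone output
    initial = np.full((5, 2), 0.5)                  # 5 landmarks starting at the image center
    weights = [rng.normal(size=(8, 2)) for _ in range(3)]  # 3 untrained stages
    final, steps = cascaded_refine(feature_map, initial, weights)
    print(final.shape, len(steps))
```

Because every stage outputs coordinates directly, a regression loss on the predicted coordinates can be applied at each stage and backpropagated through the whole cascade, which is what makes this kind of formulation trainable end-to-end.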

