CV | Oct 13, 2022

Shape Preserving Facial Landmarks with Graph Attention Networks

arXiv:2210.07233v1 | 44 citations | h-index: 22
Originality: Incremental advance
AI Analysis

This work addresses facial landmark estimation for computer vision applications, offering an incremental improvement over existing methods by enhancing spatial modeling.

The paper targets a known weakness of CNN-based facial landmark estimators: they learn only weak spatial relationships between landmarks. It proposes combining a CNN with a cascade of Graph Attention Network regressors and reports top performance on popular head pose and landmark estimation benchmarks, with the largest gains when the local appearance of landmarks changes substantially.

Top-performing landmark estimation algorithms are based on exploiting the excellent ability of large convolutional neural networks (CNNs) to represent local appearance. However, it is well known that they can only learn weak spatial relationships. To address this problem, we propose a model based on the combination of a CNN with a cascade of Graph Attention Network regressors. To this end, we introduce an encoding that jointly represents the appearance and location of facial landmarks and an attention mechanism to weigh the information according to its reliability. This is combined with a multi-task approach to initialize the location of graph nodes and a coarse-to-fine landmark description scheme. Our experiments confirm that the proposed model learns a global representation of the structure of the face, achieving top performance in popular benchmarks on head pose and landmark estimation. The improvement provided by our model is most significant in situations involving large changes in the local appearance of landmarks.
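To make the idea of attention-weighted landmark refinement concrete, here is a minimal NumPy sketch. It is not the authors' implementation: it uses a generic single-head graph attention layer over a fully connected landmark graph, and every name (`gat_layer`, `refine_landmarks`, the parameter shapes, the `tanh` displacement head) is a hypothetical choice made for illustration. It also keeps the appearance features fixed across cascade stages, which is a simplifying assumption.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def leaky_relu(x, slope=0.2):
    return np.where(x > 0, x, slope * x)

def gat_layer(h, W, a_src, a_dst):
    """Generic single-head graph attention over a fully connected graph.

    h: (N, F) node features, one row per landmark.
    W: (F, D) shared projection; a_src, a_dst: (D,) attention vectors.
    Returns the updated features (N, D) and the attention matrix (N, N).
    """
    z = h @ W
    # e[i, j] = LeakyReLU(a_src . z_i + a_dst . z_j)
    e = leaky_relu((z @ a_src)[:, None] + (z @ a_dst)[None, :])
    alpha = softmax(e, axis=1)  # each row sums to 1
    return alpha @ z, alpha

def refine_landmarks(appearance, xy, stages):
    """Hypothetical cascade: each stage predicts a displacement per landmark.

    appearance: (N, A) per-landmark appearance features (assumed given, e.g.
    sampled from CNN feature maps); xy: (N, 2) initial landmark locations.
    stages: list of (W, a_src, a_dst, W_out) parameter tuples.
    """
    for W, a_src, a_dst, W_out in stages:
        # Node encoding joins appearance and location (assumption: plain concat).
        h = np.concatenate([appearance, xy], axis=1)
        z, _ = gat_layer(h, W, a_src, a_dst)
        xy = xy + np.tanh(z) @ W_out  # residual update of the locations
    return xy
```

With random parameters this only demonstrates the data flow: each landmark's update is a weighted mix of information from all other landmarks, so a landmark with unreliable local appearance can still be positioned using the rest of the face.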

Code Implementations: 1 repo