Fiducial Focus Augmentation for Facial Landmark Detection
This paper addresses robust facial landmark detection for computer vision applications, though it appears incremental because it builds on existing augmentation and architecture techniques.
The paper tackles facial landmark detection under difficult conditions, such as head pose changes and exaggerated expressions, by proposing a novel image augmentation technique combined with a Siamese architecture and a DCCA-based loss. The authors report that the approach outperforms multiple state-of-the-art methods across various benchmark datasets.
Deep learning methods have led to significant improvements in performance on the facial landmark detection (FLD) task. However, detecting landmarks in challenging settings, such as head pose changes, exaggerated expressions, or uneven illumination, remains difficult due to high variability and insufficient samples. This inadequacy can be attributed to the model's inability to effectively acquire appropriate facial structure information from the input images. To address this, we propose a novel image augmentation technique specifically designed for the FLD task to enhance the model's understanding of facial structures. To effectively utilize the newly proposed augmentation technique, we employ a Siamese architecture-based training mechanism with a Deep Canonical Correlation Analysis (DCCA)-based loss to achieve collective learning of high-level feature representations from two different views of the input images. Furthermore, we employ a Transformer + CNN-based network with a custom hourglass module as the robust backbone for the Siamese framework. Extensive experiments show that our approach outperforms multiple state-of-the-art approaches across various benchmark datasets.
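To make the DCCA-based loss more concrete, the following is a minimal sketch of the standard DCCA objective: the negative sum of canonical correlations between feature batches produced by the two Siamese branches. The abstract does not give the exact loss formulation, so the function names (`inv_sqrt`, `dcca_loss`), the regularization constant `reg`, and the use of plain NumPy arrays in place of network outputs are all assumptions made for illustration.

```python
import numpy as np


def inv_sqrt(mat):
    # Inverse square root of a symmetric positive-definite matrix
    # via its eigendecomposition.
    eigvals, eigvecs = np.linalg.eigh(mat)
    return eigvecs @ np.diag(eigvals ** -0.5) @ eigvecs.T


def dcca_loss(h1, h2, reg=1e-4):
    """Negative total canonical correlation between two views.

    h1, h2: arrays of shape (n_samples, dim1) and (n_samples, dim2),
    standing in for the features of the two Siamese branches.
    reg: hypothetical ridge term that keeps the covariances invertible.
    """
    n = h1.shape[0]
    # Center each view over the batch.
    h1c = h1 - h1.mean(axis=0, keepdims=True)
    h2c = h2 - h2.mean(axis=0, keepdims=True)
    # Regularized within-view covariances and the cross-covariance.
    s11 = h1c.T @ h1c / (n - 1) + reg * np.eye(h1.shape[1])
    s22 = h2c.T @ h2c / (n - 1) + reg * np.eye(h2.shape[1])
    s12 = h1c.T @ h2c / (n - 1)
    # Singular values of the whitened cross-covariance are the
    # canonical correlations; minimizing their negative sum
    # maximizes the correlation between the two views.
    t = inv_sqrt(s11) @ s12 @ inv_sqrt(s22)
    return -np.linalg.svd(t, compute_uv=False).sum()


# Usage: perfectly correlated views score lower than independent ones.
rng = np.random.default_rng(0)
view_a = rng.normal(size=(256, 4))
view_b = rng.normal(size=(256, 4))
print(dcca_loss(view_a, view_a), dcca_loss(view_a, view_b))
```

For d-dimensional features, the loss approaches -d when the two views are perfectly correlated and moves toward 0 as they become independent. In a real training loop this term would be computed with a differentiable framework and presumably combined with a landmark regression loss; that combination is not specified in the abstract.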