CV

May 5, 2017

Face Detection, Bounding Box Aggregation and Pose Estimation for Robust Facial Landmark Localisation in the Wild

Zhen-Hua Feng, Josef Kittler, Muhammad Awais, Patrik Huber, Xiao-Jun Wu

arXiv:1705.02402v2

6.642 citations

Originality: Synthesis-oriented

AI Analysis

This work addresses accurate facial landmark detection under challenging real-world conditions, a core problem in computer vision. It is an incremental improvement that integrates existing techniques.

The paper tackles robust facial landmark localisation in unconstrained environments with a four-stage framework: face detection, bounding box aggregation, pose estimation, and landmark localisation. It reports better results than state-of-the-art methods on the 300W and Menpo benchmarks.

We present a framework for robust face detection and landmark localisation of faces in the wild, which has been evaluated as part of "the 2nd Facial Landmark Localisation Competition". The framework has four stages: face detection, bounding box aggregation, pose estimation and landmark localisation. To achieve a high detection rate, we use two publicly available CNN-based face detectors and two proprietary detectors. We aggregate the detected face bounding boxes of each input image to reduce false positives and improve face detection accuracy. A cascaded shape regressor, trained using faces with a variety of pose variations, is then employed for pose estimation and image pre-processing. Last, we train the final cascaded shape regressor for fine-grained landmark localisation, using a large number of training samples with limited pose variations. The experimental results obtained on the 300W and Menpo benchmarks demonstrate the superiority of our framework over state-of-the-art methods.
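The abstract does not spell out how the bounding boxes from the four detectors are aggregated. As a rough illustration of the general idea only, the sketch below groups boxes from different detectors by overlap (IoU), keeps groups supported by at least two detectors, and averages their coordinates. The function names, the IoU threshold, and the voting rule are all assumptions made for this example, not the authors' method.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def aggregate_boxes(
    detections: List[List[Box]],
    iou_threshold: float = 0.5,  # assumed value, not from the paper
    min_votes: int = 2,          # assumed voting rule, not from the paper
) -> List[Box]:
    """Merge boxes from several detectors.

    detections[i] holds the boxes returned by detector i for one image.
    Boxes that overlap a group's seed box are added to that group; a group
    is kept only if at least `min_votes` distinct detectors contributed,
    and its output is the coordinate-wise mean of its boxes.
    """
    groups: List[List[Tuple[int, Box]]] = []
    for det_idx, boxes in enumerate(detections):
        for box in boxes:
            for group in groups:
                # Compare against the first (seed) box of the group.
                if iou(group[0][1], box) >= iou_threshold:
                    group.append((det_idx, box))
                    break
            else:
                groups.append([(det_idx, box)])

    merged: List[Box] = []
    for group in groups:
        voters = {det_idx for det_idx, _ in group}
        if len(voters) >= min_votes:
            n = len(group)
            merged.append(tuple(sum(b[k] for _, b in group) / n for k in range(4)))
    return merged


if __name__ == "__main__":
    per_detector = [
        [(10, 10, 50, 50)],      # detector 0
        [(12, 12, 52, 52)],      # detector 1 agrees with detector 0
        [(200, 200, 240, 240)],  # detector 2: unsupported box, dropped
        [],                      # detector 3 found nothing
    ]
    print(aggregate_boxes(per_detector))
```

In this toy input, the two agreeing boxes are merged into one averaged box, while the box seen by a single detector is discarded as a likely false positive. This mirrors the stated goal of the aggregation stage, fewer false positives and more accurate boxes, without claiming to reproduce it.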
