FaceOcc: A Diverse, High-quality Face Occlusion Dataset for Human Face Extraction
The dataset addresses the scarcity and low quality of existing face occlusion datasets, a limitation for researchers and practitioners in computer vision. The contribution is incremental in that it builds on existing datasets.
The paper tackles the problem of face occlusions in unconstrained images by creating a large, diverse face occlusion dataset, and shows that training a simple segmentation model with it achieves state-of-the-art performance.
Occlusions are common in face images in the wild and hamper face-related tasks such as landmark detection, 3D reconstruction, and face recognition. Accurately extracting face regions from unconstrained face images is therefore beneficial. However, current face segmentation datasets suffer from small data volumes, few occlusion types, low resolution, and imprecise annotation, which limits the performance of data-driven algorithms. This paper proposes a novel face occlusion dataset with manually labeled face occlusions on images from CelebA-HQ and the internet. The occlusion types cover sunglasses, spectacles, hands, masks, scarves, microphones, etc. To the best of our knowledge, it is by far the largest and most comprehensive face occlusion dataset. Combining it with the attribute masks in CelebAMask-HQ, we trained a straightforward face segmentation model and still obtained state-of-the-art (SOTA) performance, demonstrating the effectiveness of the proposed dataset.
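The abstract does not spell out how the occlusion labels are combined with the CelebAMask-HQ attribute masks. One plausible reading, shown here only as a minimal sketch, is that the training target is the face region with every occluded pixel removed. The function name `visible_face_mask`, the nested-list mask representation, and the toy values are all assumptions made for illustration, not the authors' pipeline.

```python
from typing import List

# Binary mask: 1 = pixel belongs to the region, 0 = it does not.
Mask = List[List[int]]


def visible_face_mask(face_region: Mask, occlusion: Mask) -> Mask:
    """Keep face pixels that are not covered by an occluder.

    Hypothetical helper: assumes `face_region` is the union of the facial
    attribute masks and `occlusion` is the manually labeled occlusion mask.
    """
    if len(face_region) != len(occlusion) or any(
        len(f_row) != len(o_row) for f_row, o_row in zip(face_region, occlusion)
    ):
        raise ValueError("masks must have the same shape")
    return [
        [1 if f and not o else 0 for f, o in zip(f_row, o_row)]
        for f_row, o_row in zip(face_region, occlusion)
    ]


# Toy 3x4 example (made-up values, not taken from the dataset).
face_region = [
    [0, 1, 1, 0],
    [1, 1, 1, 1],
    [0, 1, 1, 0],
]
occlusion = [
    [0, 0, 0, 0],
    [1, 1, 1, 1],  # e.g. sunglasses covering the middle row
    [0, 0, 0, 0],
]
target = visible_face_mask(face_region, occlusion)
```

In this toy case the middle row of the face is fully occluded, so only the top and bottom rows remain in `target`. In practice the same operation would be applied to image-sized arrays rather than nested lists.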