CV LG Aug 14, 2019

FairFace: Face Attribute Dataset for Balanced Race, Gender, and Age

arXiv:1908.04913v129.9202 citations Has Code

Originality Synthesis-oriented

AI Analysis

This work addresses fairness and applicability issues that face analytic systems have with underrepresented racial groups. Its contribution is incremental, however, because it focuses on dataset curation rather than algorithmic innovation.

The authors tackled the problem of racial bias in existing face datasets by constructing FairFace, a balanced dataset of 108,501 images across 7 race groups, which improved model accuracy and consistency across races and genders in evaluations.

Existing public face datasets are strongly biased toward Caucasian faces, and other races (e.g., Latino) are significantly underrepresented. This can lead to inconsistent model accuracy, limit the applicability of face analytic systems to non-White race groups, and adversely affect research findings based on such skewed data. To mitigate the race bias in these datasets, we construct a novel face image dataset, containing 108,501 images, with an emphasis on balanced race composition in the dataset. We define 7 race groups: White, Black, Indian, East Asian, Southeast Asian, Middle East, and Latino. Images were collected from the YFCC-100M Flickr dataset and labeled with race, gender, and age groups. Evaluations were performed on existing face attribute datasets as well as novel image datasets to measure generalization performance. We find that the model trained from our dataset is substantially more accurate on novel datasets and that its accuracy is consistent across race and gender groups.
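
As a rough illustration of what "consistent accuracy between groups" could mean in practice, the sketch below computes accuracy separately for each group and then the gap between the best and worst group. The function names, the toy labels, and the max-minus-min gap are assumptions made for illustration only; they are not taken from the paper or its released code, and the paper may measure consistency differently.

```python
from collections import defaultdict


def per_group_accuracy(y_true, y_pred, groups):
    """Return {group: accuracy} for parallel sequences of true labels,
    predicted labels, and group tags (hypothetical helper)."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        total[group] += 1
        correct[group] += int(truth == pred)
    return {g: correct[g] / total[g] for g in total}


def accuracy_gap(acc_by_group):
    """Difference between the best and worst per-group accuracy.
    A gap of 0 means every group is classified equally well."""
    values = list(acc_by_group.values())
    return max(values) - min(values)


# Toy gender predictions tagged with a race group (made-up data).
y_true = ["F", "M", "F", "M", "F", "M"]
y_pred = ["F", "M", "M", "M", "F", "F"]
groups = ["White", "White", "Black", "Black", "Latino", "Latino"]

acc = per_group_accuracy(y_true, y_pred, groups)
print(acc)
print(accuracy_gap(acc))
```

A model with high overall accuracy can still show a large gap here, which is the kind of inconsistency a balanced training set is meant to reduce.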
