CV AI LG
Sep 15, 2023

Toward responsible face datasets: modeling the distribution of a disentangled latent space for sampling face images from demographic groups

Parsa Rahimi, Christophe Ecabert, Sebastien Marcel

arXiv:2309.08442v1
6.87 citations
h-index: 6

Originality: Incremental advance

AI Analysis

This work addresses fairness issues in facial recognition that affect specific demographic groups, offering an incremental improvement over collecting balanced real-world datasets.

The paper tackles the problem of biased facial recognition systems by proposing a method to generate balanced synthetic face datasets from demographic groups using a disentangled StyleGAN latent space, enabling effective synthesis of any demographic combination with identities distinct from the original training data.

Recently, it has been shown that some modern facial recognition systems can discriminate against specific demographic groups and may lead to unfair treatment with respect to facial attributes such as gender and origin. The main reason is the bias in the datasets used to train these models, which have unbalanced demographics. Unfortunately, collecting a large-scale dataset that is balanced with respect to various demographics is impracticable. In this paper, we investigate as an alternative the generation of a balanced and possibly bias-free synthetic dataset that could be used to train, regularize, or evaluate deep learning-based facial recognition models. We propose a simple method for modeling and sampling a disentangled projection of a StyleGAN latent space to generate any combination of demographic groups (e.g. hispanic-female). Our experiments show that we can synthesize any combination of demographic groups effectively and that the identities are different from those in the original training dataset. We have also released the source code.
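The abstract does not specify how the latent distribution is modeled, so the following is only a rough sketch of the general idea, not the authors' implementation. It assumes that each latent code has already been projected into a disentangled space and labeled with a demographic group, and it fits an independent (diagonal) Gaussian per group as a stand-in model. The function names `fit_group_gaussians`, `sample_group`, and `balanced_latents` are hypothetical, and the StyleGAN generator that would turn sampled codes into images is omitted.

```python
import random
import statistics


def fit_group_gaussians(latents, labels):
    """Fit one diagonal Gaussian per demographic group.

    latents: list of equal-length lists of floats (assumed to be
             projected, disentangled latent codes)
    labels:  list of group names, one per latent code
    Returns {group: (means, stds)} with one mean/std per latent dimension.
    """
    by_group = {}
    for z, group in zip(latents, labels):
        by_group.setdefault(group, []).append(z)

    models = {}
    for group, codes in by_group.items():
        dims = list(zip(*codes))  # transpose: one tuple per latent dimension
        means = [statistics.fmean(d) for d in dims]
        stds = [statistics.pstdev(d) for d in dims]  # population std dev
        models[group] = (means, stds)
    return models


def sample_group(models, group, n, rng):
    """Draw n latent codes from the Gaussian fitted for one group."""
    means, stds = models[group]
    return [[rng.gauss(m, s) for m, s in zip(means, stds)] for _ in range(n)]


def balanced_latents(models, per_group, seed=0):
    """Draw the same number of latent codes for every group."""
    rng = random.Random(seed)
    return {g: sample_group(models, g, per_group, rng) for g in models}
```

Calling `balanced_latents(models, per_group=1000)` would return the same number of latent codes for every group, regardless of how unbalanced the labeled codes were. In a full pipeline, each sampled code would then be mapped back to the generator's latent space and rendered as a face image.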
