CV | Aug 30, 2024

How Knowledge Distillation Mitigates the Synthetic Gap in Fair Face Recognition

Pedro C. Neto, Ivona Colakovic, Sašo Karakatič, Ana F. Sequeira

arXiv:2408.17399v13.72 citations | h-index: 15 | Has Code

Originality: Incremental advance

AI Analysis

This work addresses fairness and accuracy issues in face recognition models trained on synthetic data. It is an incremental advance because it builds on existing knowledge distillation techniques.

The paper tackles the performance gap of synthetic data in face recognition by applying Knowledge Distillation from a teacher trained on real data to students trained on synthetic or mixed datasets, which yields performance gains across all ethnicities and reduced bias.

Leveraging Knowledge Distillation (KD), we devise a strategy to counter the recent retraction of face recognition datasets. Given a Teacher model pretrained on a real dataset, we show that carefully using synthetic datasets, or a mix of real and synthetic datasets, to distil knowledge from this teacher into smaller students can yield surprising results. To this end, we trained 33 different models with and without KD, on different datasets, with different architectures and losses. Our findings are consistent: using KD leads to performance gains across all ethnicities and decreased bias. In addition, it helps to mitigate the performance gap between real and synthetic datasets. This approach addresses the limitations of synthetic data training, improving both the accuracy and fairness of face recognition models.
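
The abstract does not say which distillation objective is used. As an illustration only, the sketch below shows one common form of feature-level KD for face recognition: the student is penalised for the distance between its L2-normalized embedding and the teacher's, on top of its ordinary task loss. The function names, the mean-squared-error choice, and the `kd_weight` parameter are assumptions made for this example and are not taken from the paper.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit L2 norm."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def embedding_kd_loss(student_emb, teacher_emb):
    """Mean squared error between L2-normalized embeddings.

    Assumption: both embeddings have the same dimensionality. The teacher
    embedding is treated as a fixed target (no gradient flows to the teacher).
    """
    if len(student_emb) != len(teacher_emb):
        raise ValueError("embeddings must have the same dimension")
    s = l2_normalize(student_emb)
    t = l2_normalize(teacher_emb)
    return sum((a - b) ** 2 for a, b in zip(s, t)) / len(s)


def total_loss(task_loss, student_emb, teacher_emb, kd_weight=1.0):
    """Hypothetical combined objective: task loss plus weighted KD term."""
    return task_loss + kd_weight * embedding_kd_loss(student_emb, teacher_emb)
```

In this sketch the input image may come from a synthetic or a mixed dataset. The teacher, pretrained on real data, only supplies the target embedding, which is how a student could benefit from real-data knowledge without being trained on the real images directly.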
