LG CL

May 23, 2024

Synthetic Data Generation for Intersectional Fairness by Leveraging Hierarchical Group Structure

Gaurav Maheshwari, Aurélien Bellet, Pascal Denis, Mikaela Keller

arXiv:2405.14521v14.62 citations

h-index: 31

Originality: Incremental advance

AI Analysis

The paper addresses fairness issues in machine learning for intersectionally marginalized groups, though it appears incremental because it adapts data augmentation techniques.

The paper tackles intersectional fairness in classification by introducing a data augmentation method that leverages hierarchical group structure to generate synthetic data for underrepresented intersectional groups. The approach achieves superior intersectional fairness and robustness against 'leveling down' compared to traditional group fairness methods across four diverse datasets.

In this paper, we introduce a data augmentation approach specifically tailored to enhance intersectional fairness in classification tasks. Our method capitalizes on the hierarchical structure inherent to intersectionality, by viewing groups as intersections of their parent categories. This perspective allows us to augment data for smaller groups by learning a transformation function that combines data from these parent groups. Our empirical analysis, conducted on four diverse datasets including both text and images, reveals that classifiers trained with this data augmentation approach achieve superior intersectional fairness and are more robust to "leveling down" when compared to methods optimizing traditional group fairness metrics.
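To make the hierarchical view in the abstract concrete, here is a minimal Python sketch of how an intersectional group relates to its parent groups and how parent data could be combined into synthetic examples. It is an illustration under stated assumptions, not the paper's implementation. The function names (`parent_groups`, `members`, `augment`), the dataset layout (a list of feature-vector and attribute-dictionary pairs), and the rule that a parent is obtained by dropping exactly one attribute are all hypothetical. In particular, the learned transformation function is replaced by a plain element-wise average, used here only as a placeholder.

```python
from itertools import combinations


def parent_groups(group):
    """Return hypothetical parents of a group: each parent drops exactly one attribute."""
    items = sorted(group.items())
    return [dict(combo) for combo in combinations(items, len(items) - 1)]


def members(dataset, group):
    """Return feature vectors whose attributes match every (attribute, value) pair in group."""
    return [
        features
        for features, attrs in dataset
        if all(attrs.get(key) == value for key, value in group.items())
    ]


def augment(dataset, group, n_new, rng):
    """Create n_new synthetic vectors for group from its parent groups.

    Placeholder combination rule (an assumption, NOT the learned transformation
    from the paper): draw one example from each non-empty parent group and
    average the draws element-wise.
    """
    pools = [members(dataset, parent) for parent in parent_groups(group)]
    pools = [pool for pool in pools if pool]  # skip parents with no data
    if not pools:
        return []
    synthetic = []
    for _ in range(n_new):
        picks = [rng.choice(pool) for pool in pools]
        dim = len(picks[0])
        synthetic.append([sum(vec[i] for vec in picks) / len(picks) for i in range(dim)])
    return synthetic
```

For example, with a toy dataset in which the target group `{"gender": "F", "race": "B"}` has no examples of its own, `augment(dataset, group, 3, random.Random(0))` still returns three vectors, because each parent group (`gender=F` and `race=B`) contributes data. A real system would learn the combination rule rather than averaging.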
