LG AI · Nov 1, 2023

FAIRLABEL: Correcting Bias in Labels

arXiv:2311.00638v1 · 3.81 citations · h-index: 3

Originality: Incremental advance

AI Analysis

This paper addresses fairness issues in ML models for real-world applications by correcting biased labels, though the advance is incremental because it builds on existing fairness measurement methods.

The paper tackles the problem of biased ground truth labels in datasets by proposing FAIRLABEL, an algorithm that detects and corrects these biases, resulting in a label correction accuracy of 86.7% on synthetic data and up to a 54.2% increase in Disparate Impact Ratio on benchmark datasets.

There are several algorithms for measuring the fairness of ML models. A fundamental assumption in these approaches is that the ground truth is fair or unbiased. In real-world datasets, however, the ground truth often reflects historical and societal biases and discrimination. Models trained on these datasets inherit those biases and propagate them to their outputs. We propose FAIRLABEL, an algorithm that detects and corrects biases in labels. The goal of FAIRLABEL is to reduce the Disparate Impact (DI) across groups while maintaining high accuracy in predictions. We propose metrics to measure the quality of bias correction and validate FAIRLABEL on synthetic datasets, where the label correction is correct 86.7% of the time vs. 71.9% for a baseline model. We also apply FAIRLABEL to benchmark datasets such as UCI Adult, German Credit Risk, and COMPAS, and show that the Disparate Impact Ratio increases by as much as 54.2%.
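The abstract reports results in terms of the Disparate Impact Ratio but does not define it here. As a rough illustration only, the sketch below uses the common definition: the positive-outcome rate of the unprivileged group divided by that of the privileged group, where a value near 1 indicates parity. The function name, the group names, and the toy data are hypothetical, and the paper's exact formulation may differ.

```python
from typing import Sequence


def disparate_impact_ratio(
    labels: Sequence[int],
    groups: Sequence[str],
    unprivileged: str,
    privileged: str,
) -> float:
    """Ratio of positive-label rates: unprivileged group over privileged group.

    Assumes the common definition of disparate impact; this is not taken
    from the FAIRLABEL paper itself.
    """
    if len(labels) != len(groups):
        raise ValueError("labels and groups must have the same length")

    def positive_rate(group: str) -> float:
        # Collect the binary labels (0 or 1) of every member of this group.
        members = [y for y, g in zip(labels, groups) if g == group]
        if not members:
            raise ValueError(f"no samples for group {group!r}")
        return sum(members) / len(members)

    privileged_rate = positive_rate(privileged)
    if privileged_rate == 0:
        raise ValueError("privileged group has no positive labels")
    return positive_rate(unprivileged) / privileged_rate


# Toy data, made up for illustration: group "a" receives the positive
# label in 1 of 4 cases, group "b" in 3 of 4 cases.
labels = [1, 0, 0, 0, 1, 1, 1, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

ratio = disparate_impact_ratio(labels, groups, unprivileged="a", privileged="b")
print(f"Disparate Impact Ratio: {ratio:.3f}")
```

Under this definition, a label-correction step that moves the ratio toward 1 reduces the measured disparity between the two groups, which is the direction of change the abstract reports.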
