LG, CY · Sep 27, 2023

Demographic Parity: Mitigating Biases in Real-World Data

arXiv:2309.17347v1 · 9 citations · h-index: 12

Originality: Incremental advance

AI Analysis

The paper addresses fairness issues in sensitive areas such as hiring and lending, though it appears incremental because it builds on existing demographic parity concepts.

The paper tackles biases in the historical training data of computer-based decision systems. It proposes a methodology that guarantees the removal of unwanted biases while preserving classification utility. To do so, it derives an asymptotic dataset from real-world data and uses synthetic samples generated from it to train classifiers without explicit or implicit bias.

Computer-based decision systems are widely used to automate decisions in many aspects of everyday life, including sensitive areas like hiring, lending and even criminal sentencing. A decision pipeline relies heavily on large volumes of historical real-world data for training its models. However, historical training data often contains gender, racial or other biases, which are propagated to the trained models and influence computer-based decisions. In this work, we propose a robust methodology that guarantees the removal of unwanted biases while maximally preserving classification utility. Our approach can always achieve this in a model-independent way by deriving from real-world data the asymptotic dataset that uniquely encodes demographic parity and realism. As a proof-of-principle, we deduce from public census records such an asymptotic dataset from which synthetic samples can be generated to train well-established classifiers. Benchmarking the generalization capability of these classifiers trained on our synthetic data, we confirm the absence of any explicit or implicit bias in the computer-aided decision.
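For readers unfamiliar with the criterion named in the abstract, demographic parity is commonly stated as requiring the rate of positive decisions to be the same for every group of a protected attribute. The sketch below only illustrates how that criterion could be checked on a classifier's outputs; it is not the paper's method for constructing the asymptotic dataset, which this summary does not describe. The function names and the toy data are hypothetical.

```python
from collections import defaultdict


def positive_rates(predictions, groups):
    """Return the fraction of positive (== 1) predictions for each group."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for pred, group in zip(predictions, groups):
        totals[group] += 1
        positives[group] += int(pred == 1)
    return {g: positives[g] / totals[g] for g in totals}


def demographic_parity_gap(predictions, groups):
    """Largest difference in positive-prediction rate between any two groups.

    A gap of 0.0 means the predictions satisfy demographic parity exactly
    on this sample.
    """
    rates = positive_rates(predictions, groups)
    return max(rates.values()) - min(rates.values())


# Hypothetical toy data: binary decisions and a protected attribute
# with two groups, "a" and "b".
preds = [1, 0, 1, 1, 0, 1, 0, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

print(positive_rates(preds, groups))
print(demographic_parity_gap(preds, groups))
```

In this toy sample, group "a" receives a positive decision three times out of four and group "b" once out of four, so the gap is far from zero. A benchmark of the kind the abstract mentions would presumably look for a gap close to zero on held-out data, though the exact metrics used in the paper are not stated here.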
