LG PR

Apr 1, 2025

Explainable post-training bias mitigation with distribution-based fairness metrics

Ryan Franks, Alexey Miroshnikov, Konstandinos Kotsiopoulos

arXiv:2504.01223v3

h-index: 1

Originality: Incremental advance

AI Analysis

This work addresses fairness issues in ML models for applications requiring demographic blindness, though it is incremental as it builds on existing post-processing methods.

The authors address bias in machine learning models with a post-processing framework built on distribution-based fairness constraints. The framework mitigates bias across a range of fairness levels without retraining the underlying model, and the authors support this with empirical tests on multiple datasets.

We develop a novel bias mitigation framework with distribution-based fairness constraints suitable for producing demographically blind and explainable machine-learning models across a wide range of fairness levels. This is accomplished through post-processing, allowing fairer models to be generated efficiently without retraining the underlying model. Our framework, which is based on stochastic gradient descent, can be applied to a wide range of model types, with a particular emphasis on the post-processing of gradient-boosted decision trees. Additionally, we design a broad family of global fairness metrics, along with differentiable and consistent estimators compatible with our framework, building on previous work. We empirically test our methodology on a variety of datasets and compare it with alternative post-processing approaches, including Bayesian search, optimal transport projection, and direct neural network training.
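To make "distribution-based fairness metric" concrete, here is a minimal sketch of one such metric: the empirical 1-Wasserstein distance between the model-score distributions of two demographic groups. This is an illustration only. The abstract does not specify the paper's metric family or its estimators, so the choice of metric, the function name `wasserstein1_bias`, the quantile-grid approximation, and the synthetic data are all assumptions.

```python
import numpy as np


def wasserstein1_bias(scores, group, n_quantiles=200):
    """Approximate the 1-Wasserstein distance between the score
    distributions of two groups (hypothetical helper, not the paper's
    estimator).

    The distance is approximated by averaging the absolute difference
    of the two empirical quantile functions over a uniform grid.
    """
    scores = np.asarray(scores, dtype=float)
    group = np.asarray(group)
    labels = np.unique(group)
    if labels.size != 2:
        raise ValueError("this sketch handles exactly two groups")

    # Midpoint grid on (0, 1) avoids the extreme quantiles 0 and 1.
    p = (np.arange(n_quantiles) + 0.5) / n_quantiles
    q_a = np.quantile(scores[group == labels[0]], p)
    q_b = np.quantile(scores[group == labels[1]], p)
    return float(np.mean(np.abs(q_a - q_b)))


# Synthetic example: group 1 receives scores shifted upward by 0.2,
# so every quantile differs by 0.2 between the two groups.
rng = np.random.default_rng(0)
base = rng.uniform(size=1000)
scores = np.concatenate([base, base + 0.2])
group = np.concatenate([np.zeros(1000, dtype=int), np.ones(1000, dtype=int)])

bias_shifted = wasserstein1_bias(scores, group)
bias_identical = wasserstein1_bias(np.concatenate([base, base]), group)
```

A metric of this kind compares whole score distributions rather than outcomes at a single decision threshold. In a post-processing setup like the one the abstract describes, it could serve as a penalty term alongside the predictive loss, though a gradient-based optimizer would need a differentiable estimator, which this NumPy sketch does not provide.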
