A Statistical Test for Probabilistic Fairness
This work addresses the problem of detecting algorithmic bias in pre-trained classifiers. It is aimed at practitioners who deploy machine learning models in high-stakes applications, and it offers an incremental but rigorous testing mechanism.
This paper proposes a statistical hypothesis test for detecting unfair logistic classifiers, leveraging optimal transport theory to quantify the distance of empirical data distributions from a manifold of fair distributions. The test is shown to be asymptotically correct both theoretically and empirically.
Algorithms are now routinely used to make consequential decisions that affect human lives. Examples include college admissions, medical interventions, and law enforcement. While algorithms empower us to harness the information hidden in vast amounts of data, they may inadvertently amplify existing biases in the available datasets. This concern has sparked increasing interest in fair machine learning, which aims to quantify and mitigate algorithmic discrimination. Indeed, machine learning models should undergo intensive tests to detect algorithmic biases before being deployed at scale. In this paper, we use ideas from the theory of optimal transport to propose a statistical hypothesis test for detecting unfair classifiers. Leveraging the geometry of the feature space, the test statistic quantifies the distance from the empirical distribution supported on the test samples to the manifold of distributions that render a pre-trained classifier fair. We develop a rigorous hypothesis testing mechanism for assessing the probabilistic fairness of any pre-trained logistic classifier, and we show both theoretically and empirically that the proposed test is asymptotically correct. In addition, the proposed framework offers interpretability by identifying the most favorable perturbation of the data, that is, the one under which the given classifier becomes fair.
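To make the idea of a classifier being fair "with respect to a distribution" concrete, the following minimal sketch computes a plug-in fairness gap for a logistic classifier on a test sample. The abstract does not state which fairness notion is used, so the sketch assumes probabilistic equal opportunity: the mean predicted probability among positively labeled samples should be equal across the two groups of a binary sensitive attribute. The function name `equal_opportunity_gap` and its parameters are hypothetical. This is only an illustrative diagnostic on the empirical distribution, not the optimal-transport test statistic proposed in the paper.

```python
import numpy as np


def sigmoid(z):
    """Logistic function, mapping real-valued scores to probabilities."""
    return 1.0 / (1.0 + np.exp(-z))


def equal_opportunity_gap(X, a, y, beta, intercept=0.0):
    """Plug-in gap in probabilistic equal opportunity (assumed fairness notion).

    X         : (n, d) feature matrix of the test samples
    a         : (n,) binary sensitive attribute (0 or 1)
    y         : (n,) binary labels (0 or 1)
    beta      : (d,) weights of the pre-trained logistic classifier
    intercept : bias term of the classifier

    Returns the difference between the mean predicted probability of the
    group a == 1 and that of the group a == 0, restricted to samples with
    y == 1. A value of zero means the classifier satisfies the assumed
    notion exactly on the empirical distribution.
    """
    X = np.asarray(X, dtype=float)
    a = np.asarray(a)
    y = np.asarray(y)
    beta = np.asarray(beta, dtype=float)

    scores = sigmoid(X @ beta + intercept)
    positives = y == 1
    group1 = scores[positives & (a == 1)]
    group0 = scores[positives & (a == 0)]
    if group1.size == 0 or group0.size == 0:
        raise ValueError("each group needs at least one positively labeled sample")
    return float(group1.mean() - group0.mean())


# Toy data: the single feature is perfectly aligned with the sensitive attribute.
X_toy = np.array([[1.0], [-1.0], [1.0], [-1.0]])
a_toy = np.array([1, 0, 1, 0])
y_toy = np.array([1, 1, 1, 1])

gap_biased = equal_opportunity_gap(X_toy, a_toy, y_toy, beta=np.array([1.0]))
gap_neutral = equal_opportunity_gap(X_toy, a_toy, y_toy, beta=np.array([0.0]))
```

On finite samples the gap is almost never exactly zero, even for a classifier that is fair with respect to the underlying population. A hypothesis test of the kind described above is needed to decide whether an observed deviation is statistically significant.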