Measuring Unfairness through Game-Theoretic Interpretability
This work addresses how to evaluate fairness measures, a problem of interest to researchers in machine learning fairness. Its contribution is incremental: it builds on existing measures rather than introducing new ones.
The paper tackles the lack of comparison between fairness measures and feature importance measures by proposing evaluation methods, with a focus on SHAP, and applies them to unfairness-prone datasets.
One often finds in the literature connections between measures of fairness and measures of feature importance employed to interpret trained classifiers. However, there seems to be no study that compares fairness measures and feature importance measures. In this paper we propose ways to evaluate and compare such measures. We focus in particular on SHAP, a game-theoretic measure of feature importance; we present results for a number of unfairness-prone datasets.
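To make the kind of comparison described above concrete, the following is a minimal sketch and not the paper's evaluation procedure. It computes exact Shapley values by enumerating coalitions for a toy linear scorer, then places the mean absolute attribution of a protected attribute next to a standard fairness measure, the demographic parity difference. The data, weights, threshold, and all function names are hypothetical and chosen only for illustration; the coalition value function assumes that features outside the coalition are replaced by values from the background rows.

```python
from itertools import combinations
from math import factorial


def shapley_values(value, n):
    """Exact Shapley values for a coalition value function over n players."""
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):
            # Weight of a coalition of size k that does not contain player i.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


# Hypothetical toy data: (protected attribute, feature A, feature B).
X = [
    (1, 5.0, 2.0),
    (1, 3.0, 1.0),
    (0, 4.0, 3.0),
    (0, 1.0, 1.0),
]
WEIGHTS = (1.5, 0.5, 0.25)  # hypothetical linear scorer that uses the protected attribute
THRESHOLD = 3.0             # hypothetical decision threshold


def score(x):
    return sum(w * v for w, v in zip(WEIGHTS, x))


def predict(x):
    return 1 if score(x) >= THRESHOLD else 0


def make_value(x, background):
    """Coalition value: mean score with features in S taken from x, the rest from background rows."""
    def value(s):
        total = 0.0
        for row in background:
            mixed = tuple(x[j] if j in s else row[j] for j in range(len(x)))
            total += score(mixed)
        return total / len(background)
    return value


def demographic_parity_difference(rows):
    """Positive-prediction rate of group 1 minus that of group 0."""
    rate = {}
    for g in (0, 1):
        group = [r for r in rows if r[0] == g]
        rate[g] = sum(predict(r) for r in group) / len(group)
    return rate[1] - rate[0]


def mean_abs_shap(rows, feature):
    """Mean absolute Shapley attribution of one feature over all rows."""
    n = len(rows[0])
    total = sum(abs(shapley_values(make_value(r, rows), n)[feature]) for r in rows)
    return total / len(rows)


dp_gap = demographic_parity_difference(X)
protected_importance = mean_abs_shap(X, 0)
print(f"demographic parity difference: {dp_gap:.3f}")
print(f"mean |Shapley| of protected attribute: {protected_importance:.3f}")
```

Exhaustive enumeration is exponential in the number of features, so it is only practical for small examples like this one; it is used here to keep the sketch free of external dependencies and to make the game-theoretic definition explicit.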