Fair Wrapping for Black-box Predictions
This work addresses fairness in machine learning for users of black-box models, presenting an incremental improvement that adapts the existing analysis of improper loss functions to bias reduction.
The paper tackles the problem of reducing bias in black-box classifiers by introducing a post-processing technique that uses α-trees to modify predictions, demonstrating its application across three fairness notions with experiments on multiple datasets.
We introduce a new family of techniques to post-process ("wrap") a black-box classifier in order to reduce its bias. Our technique builds on the recent analysis of improper loss functions whose optimization can correct any twist in prediction; we treat unfairness as one such twist. The post-processing learns a wrapper function, defined as an $α$-tree, that modifies the prediction. We provide two generic boosting algorithms to learn $α$-trees. We show that our modification has appealing properties in terms of composition of $α$-trees, generalization, interpretability, and KL divergence between modified and original predictions. We exemplify the use of our technique on three fairness notions: conditional value-at-risk, equality of opportunity, and statistical parity; and provide experiments on several readily available datasets.
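To make the three fairness notions concrete, the following is a minimal sketch of how each could be measured on the outputs of a binary classifier. It uses common textbook definitions, not necessarily the exact formulations of the paper: the function names, the binary sensitive attribute `group`, the toy arrays, and the simple empirical CVaR estimator (mean of the worst `1 - beta` fraction of losses, applied here to per-group error rates) are all assumptions made for illustration.

```python
import numpy as np

def statistical_parity_gap(y_pred, group):
    """Absolute difference in positive-prediction rates between groups 0 and 1."""
    return abs(y_pred[group == 0].mean() - y_pred[group == 1].mean())

def equal_opportunity_gap(y_true, y_pred, group):
    """Absolute difference in true-positive rates between groups 0 and 1."""
    tpr = [y_pred[(group == g) & (y_true == 1)].mean() for g in (0, 1)]
    return abs(tpr[0] - tpr[1])

def cvar(losses, beta):
    """Empirical CVaR at level beta: mean of the worst (1 - beta) fraction of losses."""
    ordered = np.sort(np.asarray(losses, dtype=float))[::-1]  # largest losses first
    k = max(1, int(np.ceil((1 - beta) * len(ordered))))       # size of the tail
    return ordered[:k].mean()

# Toy data (hypothetical): labels, sensitive attribute, and hard predictions.
y_true = np.array([1, 1, 0, 0, 1, 1, 0, 0])
group = np.array([0, 0, 0, 0, 1, 1, 1, 1])
y_pred = np.array([1, 1, 1, 0, 1, 0, 0, 0])

sp_gap = statistical_parity_gap(y_pred, group)         # 0.75 vs 0.25 positive rate
eo_gap = equal_opportunity_gap(y_true, y_pred, group)  # 1.0 vs 0.5 true-positive rate

# Per-group 0-1 error rates, then the CVaR over groups (worst-off half at beta = 0.5).
group_losses = [np.mean(y_pred[group == g] != y_true[group == g]) for g in (0, 1)]
worst_case_risk = cvar(group_losses, beta=0.5)
```

In this framing, a wrapper would be judged by how much it shrinks `sp_gap`, `eo_gap`, or `worst_case_risk` relative to the unwrapped black-box predictions, while keeping the modified predictions close to the original ones.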