May 31, 2022

Certifying Some Distributional Fairness with Subpopulation Decomposition

arXiv:2205.15494v2 | 17 citations | h-index: 70
Originality: Incremental advance
AI Analysis

This work addresses the lack of certified fairness guarantees for the end-to-end performance of ML models, a gap that matters in high-stakes domains such as medical insurance and hiring. It appears incremental because it builds on existing distributional robustness concepts.

The paper tackles the problem of certifying fairness in machine learning models by formulating it as an optimization problem based on performance loss bounds under fairness constraints, and proposes a framework using subpopulation decomposition to solve it. The results show tight certification in sensitive shifting scenarios and non-trivial certification under general shifting on six real-world datasets, with significantly tighter bounds compared to existing methods on Gaussian data.

Extensive efforts have been made to understand and improve the fairness of machine learning models based on observational metrics, especially in high-stakes domains such as medical insurance, education, and hiring decisions. However, there is a lack of certified fairness considering the end-to-end performance of an ML model. In this paper, we first formulate the certified fairness of an ML model trained on a given data distribution as an optimization problem based on the model performance loss bound on a fairness-constrained distribution, which is within a bounded distributional distance of the training distribution. We then propose a general fairness certification framework and instantiate it for both sensitive shifting and general shifting scenarios. In particular, we propose to solve the optimization problem by decomposing the original data distribution into analytical subpopulations and proving the convexity of the resulting subproblems so that they can be solved. We evaluate our certified fairness on six real-world datasets and show that our certification is tight in the sensitive shifting scenario and provides non-trivial certification under general shifting. Our framework is flexible enough to integrate additional non-skewness constraints, and we show that it provides even tighter certification under different real-world scenarios. We also compare our certified fairness bound with adapted existing distributional robustness bounds on Gaussian data and demonstrate that our method is significantly tighter.
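
To make the optimization problem in the abstract concrete, here is a minimal sketch of a toy sensitive-shifting instance. It is an illustration under stated assumptions, not the paper's solver. It assumes a binary sensitive attribute and a binary label (four subpopulations), takes the fairness constraint to be equal base rates across groups, and measures distributional distance with the Hellinger distance under the convention H^2 = 1 - sum(sqrt(p_k * q_k)). The function names, the example proportions, and the per-subpopulation losses are all hypothetical. The search is a plain grid search rather than the convex subproblems described above, so it only approximates the worst case and is not a sound certificate.

```python
import math

def hellinger(p, q):
    """Hellinger distance between two discrete distributions
    (assumed convention: H^2 = 1 - sum_k sqrt(p_k * q_k))."""
    bc = sum(math.sqrt(pk * qk) for pk, qk in zip(p, q))
    return math.sqrt(max(0.0, 1.0 - bc))

def fair_proportions(a, b):
    """Subpopulation proportions ordered (s, y) = (0,0), (0,1), (1,0), (1,1),
    with Q(S=1) = a and the same base rate Q(Y=1 | S=s) = b in both groups."""
    return [(1 - a) * (1 - b), (1 - a) * b, a * (1 - b), a * b]

def worst_case_loss(p, losses, rho, steps=200):
    """Grid-search the largest expected loss over fair proportion vectors q
    with hellinger(p, q) <= rho. Returns None if no grid point is feasible.

    p      : training proportions of the four subpopulations
    losses : expected model loss inside each subpopulation (held fixed,
             which is the sensitive-shifting assumption in this sketch)
    """
    best = None
    for i in range(1, steps):
        a = i / steps
        for j in range(1, steps):
            b = j / steps
            q = fair_proportions(a, b)
            if hellinger(p, q) <= rho:
                loss = sum(qk * lk for qk, lk in zip(q, losses))
                if best is None or loss > best:
                    best = loss
    return best

# Hypothetical numbers, chosen only for illustration.
p_train = [0.3, 0.2, 0.3, 0.2]
subpop_losses = [0.1, 0.2, 0.1, 0.4]
print(worst_case_loss(p_train, subpop_losses, rho=0.1))
```

Because the per-subpopulation losses stay fixed in this toy setting, the objective is linear in the proportions q, and only the four mixing weights change between the training distribution and the shifted one.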

Code Implementations: 1 repo
