LG, AI, ML | Sep 18, 2020

Group Fairness by Probabilistic Modeling with Latent Fair Decisions

arXiv:2009.09031v2 | 43 citations
AI Analysis

This work addresses fairness in machine learning systems for high-stakes applications like loan approvals, but it is incremental as it builds on existing probabilistic modeling techniques.

The paper tackles the problem of learning fair probability distributions from biased data by modeling a latent variable representing an unbiased label, and demonstrates successful retrieval of fair labels on synthetic data and competitive accuracy on real-world datasets.

Machine learning systems are increasingly being used to make impactful decisions such as loan approvals and criminal justice risk assessments, and as such, ensuring fairness of these systems is critical. This is often challenging as the labels in the data are biased. This paper studies learning fair probability distributions from biased data by explicitly modeling a latent variable that represents a hidden, unbiased label. In particular, we aim to achieve demographic parity by enforcing certain independencies in the learned model. We also show that group fairness guarantees are meaningful only if the distribution used to provide those guarantees indeed captures the real-world data. In order to closely model the data distribution, we employ probabilistic circuits, an expressive and tractable probabilistic model, and propose an algorithm to learn them from incomplete data. We evaluate our approach on a synthetic dataset in which observed labels indeed come from fair labels but with added bias, and demonstrate that the fair labels are successfully retrieved. Moreover, we show on real-world datasets that our approach is not only a better model of how the data was generated than existing methods but also achieves competitive accuracy.
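To make the demographic parity criterion and the synthetic setup concrete, here is a minimal sketch in Python. It is not the paper's method or data: the function names, the 0.5 base rate, and the 0.3 flip probability are all assumptions chosen for illustration, and no probabilistic circuit is learned. It only shows how a label drawn independently of a binary sensitive attribute satisfies demographic parity, and how group-dependent bias added on top breaks it.

```python
import random


def demographic_parity_gap(decisions, groups):
    """Absolute difference in positive-decision rates between groups 0 and 1."""
    rates = []
    for g in (0, 1):
        members = [d for d, s in zip(decisions, groups) if s == g]
        rates.append(sum(members) / len(members))
    return abs(rates[0] - rates[1])


def make_synthetic(n=20000, flip_prob=0.3, seed=0):
    """Hypothetical generator: fair labels plus group-dependent bias.

    The fair label is drawn independently of the sensitive attribute.
    The observed label copies it, except that positive labels in
    group 1 are flipped to negative with probability flip_prob.
    """
    rng = random.Random(seed)
    groups, fair, observed = [], [], []
    for _ in range(n):
        s = rng.randint(0, 1)                     # sensitive attribute
        d_fair = 1 if rng.random() < 0.5 else 0   # latent fair label
        d_obs = d_fair
        if s == 1 and d_fair == 1 and rng.random() < flip_prob:
            d_obs = 0                             # bias against group 1
        groups.append(s)
        fair.append(d_fair)
        observed.append(d_obs)
    return groups, fair, observed


groups, fair, observed = make_synthetic()
fair_gap = demographic_parity_gap(fair, groups)
observed_gap = demographic_parity_gap(observed, groups)
print(f"gap on fair labels:     {fair_gap:.3f}")
print(f"gap on observed labels: {observed_gap:.3f}")
```

Under these assumptions the gap on the fair labels should be close to zero, while the gap on the observed labels should be close to 0.5 × 0.3 = 0.15. Recovering the fair labels from the observed ones is the harder problem the paper addresses, and this sketch does not attempt it.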
