AI · ML · Feb 25, 2020

Counterfactual fairness: removing direct effects through regularization

arXiv:2002.10774v2 · 23 citations
AI Analysis

The paper addresses fairness in machine learning toward unprivileged groups; its contribution is incremental, since it builds on existing causal methods.

The paper proposes a new fairness definition based on the Controlled Direct Effect (CDE) and develops regularizations that remove the impact of unprivileged group variables on model outcomes. Experiments on synthetic and real-world datasets show that the approach mitigates unfairness with small reductions in model performance.

Building machine learning models that are fair with respect to an unprivileged group is a topical problem. Modern fairness-aware algorithms often ignore causal effects and enforce fairness through modifications applicable to only a subset of machine learning models. In this work, we propose a new definition of fairness that incorporates causality through the Controlled Direct Effect (CDE). We develop regularizations to tackle classical fairness measures and present a causal regularization that satisfies our new fairness definition by removing the impact of unprivileged group variables on the model outcomes as measured by the CDE. These regularizations are applicable to any model trained by iteratively minimizing a loss through differentiation. We demonstrate our approaches using both gradient boosting and logistic regression on a synthetic dataset, the UCI Adult (Census) Dataset, and a real-world credit-risk dataset. Our approaches were found to mitigate unfairness in the predictions with small reductions in model performance.
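To make the idea of a fairness regularization added to a differentiable loss concrete, here is a minimal sketch in NumPy. It is not the paper's CDE regularizer: as a stand-in it penalizes the squared gap in mean predicted score between groups (a classical, demographic-parity-style measure) on top of the logistic loss. The function names `fit_logreg` and `mean_gap`, the weight `lam`, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logreg(X, y, group, lam=0.0, lr=0.2, n_iter=2000):
    """Gradient descent on: mean log-loss + lam * gap**2.

    gap = mean prediction for group == 1 minus mean prediction for group == 0.
    `group` is a 0/1 array; 1 marks the unprivileged group (assumed encoding).
    """
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    g1 = group == 1
    # d(gap)/d(p_i): +1/n1 for unprivileged rows, -1/n0 for the others
    dgap_dp = np.where(g1, 1.0 / g1.sum(), -1.0 / (~g1).sum())
    for _ in range(n_iter):
        p = sigmoid(X @ w + b)
        gap = p[g1].mean() - p[~g1].mean()
        # Gradient w.r.t. the logits: log-loss term plus penalty term
        grad_logit = (p - y) / n + 2.0 * lam * gap * dgap_dp * p * (1.0 - p)
        w -= lr * (X.T @ grad_logit)
        b -= lr * grad_logit.sum()
    return w, b

def mean_gap(X, group, w, b):
    """Difference in mean predicted score between the two groups."""
    p = sigmoid(X @ w + b)
    return p[group == 1].mean() - p[group == 0].mean()

# Hypothetical synthetic data: the outcome depends on one feature and on group.
rng = np.random.default_rng(0)
n = 2000
group = rng.integers(0, 2, size=n)
x1 = rng.normal(size=n) - 0.5 * group
y = (rng.random(n) < sigmoid(x1 - 1.5 * group)).astype(float)
X = np.column_stack([x1, group])

w0, b0 = fit_logreg(X, y, group, lam=0.0)  # unregularized baseline
w1, b1 = fit_logreg(X, y, group, lam=5.0)  # with the fairness penalty
gap_plain = mean_gap(X, group, w0, b0)
gap_fair = mean_gap(X, group, w1, b1)
print(f"gap without penalty: {gap_plain:.3f}, with penalty: {gap_fair:.3f}")
```

Because the penalty only changes the gradient, the same pattern would carry over to other learners trained by iterative loss minimization, such as gradient boosting with a custom objective. Raising `lam` trades predictive accuracy for a smaller gap.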

