LGFeb 18, 2025

Learning Counterfactually Fair Models via Improved Generation with Neural Causal Models

arXiv:2502.12796v2h-index: 3
Originality Incremental advance
AI Analysis

This work addresses fairness concerns for deploying ML models in real-world applications, though it appears incremental as it builds on existing counterfactual fairness methods.

The paper tackles the problem of ensuring counterfactual fairness in machine learning models by addressing limitations in generating faithful counterfactual samples and enforcing fairness conditions, resulting in an improved trade-off between fairness and generalization on synthetic and benchmark datasets.

One of the main concerns while deploying machine learning models in real-world applications is fairness. Counterfactual fairness has emerged as an intuitive and natural definition of fairness. However, existing methodologies for enforcing counterfactual fairness seem to have two limitations: (i) generating counterfactual samples faithful to the underlying causal graph, and (ii) as we argue in this paper, existing regularizers are mere proxies and do not directly enforce the exact definition of counterfactual fairness. In this work, our aim is to mitigate both issues. Firstly, we propose employing Neural Causal Models (NCMs) for generating the counterfactual samples. For implementing the abduction step in NCMs, the posteriors of the exogenous variables need to be estimated given a counterfactual query, as they are not readily available. As a consequence, $\mathcal{L}_3$ consistency with respect to the underlying causal graph cannot be guaranteed in practice due to the estimation errors involved. To mitigate this issue, we propose a novel kernel least squares loss term that enforces the $\mathcal{L}_3$ constraints explicitly. Thus, we obtain an improved counterfactual generation suitable for the counterfactual fairness task. Secondly, we propose a new MMD-based regularizer term that explicitly enforces the counterfactual fairness conditions into the base model while training. We show an improved trade-off between counterfactual fairness and generalization over existing baselines on synthetic and benchmark datasets.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes