Retiring $Δ$DP: New Distribution-Level Metrics for Demographic Parity
This work addresses a foundational problem in fairness measurement for machine learning practitioners by offering more precise tools. The contribution is incremental, since it refines how demographic parity is measured rather than replacing the core concept.
The paper identifies two flaws in the widely used demographic parity metric ΔDP: a zero value does not guarantee that demographic parity holds, and the value varies with the classification threshold. It proposes two distribution-level metrics, ABPC and ABCC, for which a zero value does guarantee zero violation and which are invariant to the threshold. Re-evaluating existing fair models with these metrics reveals different fairness behaviors.
Demographic parity is the most widely recognized measure of group fairness in machine learning; it ensures equal treatment of different demographic groups. Numerous works aim to achieve demographic parity by optimizing for the commonly used metric $ΔDP$. Unfortunately, in this paper we reveal that the fairness metric $ΔDP$ cannot precisely measure the violation of demographic parity, because it has the following inherent drawbacks: i) a zero-value $ΔDP$ does not guarantee zero violation of demographic parity; ii) $ΔDP$ values can vary with different classification thresholds. To this end, we propose two new fairness metrics, Area Between Probability density function Curves (ABPC) and Area Between Cumulative density function Curves (ABCC), to precisely measure the violation of demographic parity at the distribution level. The new fairness metrics directly measure the difference between the distributions of the prediction probability for different demographic groups. Our proposed metrics therefore enjoy two properties: i) zero-value ABCC/ABPC guarantees zero violation of demographic parity; ii) ABCC/ABPC guarantees demographic parity even as the classification threshold is adjusted. We further re-evaluate existing fair models with our proposed fairness metrics and observe different fairness behaviors of those models under the new metrics. The code is available at https://github.com/ahxt/new_metric_for_demographic_parity
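
The contrast between the threshold-based metric and the distribution-level metrics can be illustrated with a small sketch. This is not the authors' implementation (see the repository above for that). It assumes binary group labels and prediction probabilities in [0, 1], approximates ABCC from empirical CDFs on a uniform grid, and approximates ABPC from histogram density estimates. The function names, the grid size, the bin count, and the toy data are all hypothetical choices made for illustration.

```python
import numpy as np


def delta_dp(scores, group, threshold=0.5):
    """Gap in positive-prediction rates between the two groups at one threshold."""
    pred = scores >= threshold
    return float(abs(pred[group == 0].mean() - pred[group == 1].mean()))


def abcc(scores, group, grid_size=1001):
    """Approximate area between the two groups' empirical CDFs over [0, 1]."""
    grid = np.linspace(0.0, 1.0, grid_size)
    s0 = np.sort(scores[group == 0])
    s1 = np.sort(scores[group == 1])
    # Empirical CDF: fraction of scores less than or equal to each grid point.
    cdf0 = np.searchsorted(s0, grid, side="right") / s0.size
    cdf1 = np.searchsorted(s1, grid, side="right") / s1.size
    gap = np.abs(cdf0 - cdf1)
    # Trapezoidal rule on the uniform grid.
    return float(np.sum((gap[:-1] + gap[1:]) / 2.0) * (grid[1] - grid[0]))


def abpc(scores, group, bins=50):
    """Approximate area between the two groups' density curves over [0, 1].

    Assumption: histograms stand in for the density estimates; a kernel
    density estimate could be substituted.
    """
    edges = np.linspace(0.0, 1.0, bins + 1)
    pdf0, _ = np.histogram(scores[group == 0], bins=edges, density=True)
    pdf1, _ = np.histogram(scores[group == 1], bins=edges, density=True)
    return float(np.sum(np.abs(pdf0 - pdf1) * np.diff(edges)))


# Toy data (hypothetical): both groups have a 50% positive rate at
# threshold 0.5, but their score distributions differ.
scores = np.array([0.4, 0.6] * 100 + [0.1, 0.9] * 100)
group = np.array([0] * 200 + [1] * 200)

print("DeltaDP @ 0.5:", delta_dp(scores, group, threshold=0.5))
print("DeltaDP @ 0.3:", delta_dp(scores, group, threshold=0.3))
print("ABCC:", abcc(scores, group))
print("ABPC:", abpc(scores, group))
```

On this toy data, $ΔDP$ is 0 at threshold 0.5 but 0.5 at threshold 0.3, which shows both drawbacks at once. ABCC and ABPC are computed from the full score distributions, take no threshold as input, and are both clearly positive here.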