Avoiding Disparity Amplification under Different Worldviews
This work clarifies which fairness definitions in machine learning are appropriate under different assumptions about data bias, helping practitioners and policymakers avoid amplifying existing disparities.
The paper mathematically compares four definitions of group-level nondiscrimination under different worldviews. It argues that, under specific worldviews, the goal of avoiding disparity amplification motivates demographic parity and equalized odds, whereas predictive parity and calibration are insufficient: predictive parity allows an arbitrarily large inter-group disparity, and calibration is not robust to post-processing.
We mathematically compare four competing definitions of group-level nondiscrimination: demographic parity, equalized odds, predictive parity, and calibration. Using the theoretical framework of Friedler et al., we study the properties of each definition under various worldviews, which are assumptions about how, if at all, the observed data is biased. We argue that different worldviews call for different definitions of fairness, and we specify the worldviews that, when combined with the desire to avoid a criterion for discrimination that we call disparity amplification, motivate demographic parity and equalized odds. We also argue that predictive parity and calibration are insufficient for avoiding disparity amplification because predictive parity allows an arbitrarily large inter-group disparity and calibration is not robust to post-processing. Finally, we define a worldview that is more realistic than the previously considered ones, and we introduce a new notion of fairness that corresponds to this worldview.
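To make the compared criteria concrete, the following minimal sketch computes the per-group rates that underlie three of them for binary predictions: the positive prediction rate (demographic parity), the true and false positive rates (equalized odds), and the positive predictive value (predictive parity). It uses the standard observed-label forms of these definitions, which may differ from the paper's formulation in the framework of Friedler et al. The function names `group_rates` and `disparity_gaps` and the toy data are hypothetical, chosen only for illustration. Calibration is omitted because it is defined on scores rather than binary predictions.

```python
def rate(flags):
    """Return the fraction of True entries, or None for an empty list."""
    return sum(flags) / len(flags) if flags else None


def group_rates(y_true, y_pred, group, g):
    """Per-group rates used by the three binary-prediction criteria."""
    rows = [(y, p) for y, p, a in zip(y_true, y_pred, group) if a == g]
    return {
        # Demographic parity compares P(pred = 1 | group).
        "positive_rate": rate([p == 1 for _, p in rows]),
        # Equalized odds compares P(pred = 1 | label, group) for both labels.
        "tpr": rate([p == 1 for y, p in rows if y == 1]),
        "fpr": rate([p == 1 for y, p in rows if y == 0]),
        # Predictive parity compares P(label = 1 | pred = 1, group).
        "ppv": rate([y == 1 for y, p in rows if p == 1]),
    }


def disparity_gaps(y_true, y_pred, group, g0, g1):
    """Absolute between-group gap for each rate (None if undefined)."""
    r0 = group_rates(y_true, y_pred, group, g0)
    r1 = group_rates(y_true, y_pred, group, g1)
    return {
        k: None if r0[k] is None or r1[k] is None else abs(r0[k] - r1[k])
        for k in r0
    }


# Hypothetical toy data: two groups of four individuals each.
group = ["a"] * 4 + ["b"] * 4
y_true = [1, 1, 0, 0, 1, 1, 1, 0]
y_pred = [1, 0, 1, 0, 1, 1, 0, 0]

gaps = disparity_gaps(y_true, y_pred, group, "a", "b")
print(gaps)
```

On this toy data the positive-rate gap is 0 while the false positive rate and positive predictive value gaps are both 0.5, so a predictor can satisfy demographic parity while violating equalized odds and predictive parity. This illustrates why the choice among the definitions matters.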