Differentially Private Empirical Risk Minimization under the Fairness Lens
This paper addresses fairness issues in private machine learning systems, an incremental but important step toward equitable outcomes in privacy-preserving AI.
The paper investigates how differential privacy (DP) in empirical risk minimization can worsen bias and unfairness across groups. It analyzes the causes of accuracy disparities in two DP methods, output perturbation and differentially private stochastic gradient descent, and proposes mitigation guidelines that are evaluated on multiple datasets.
Differential Privacy (DP) is an important privacy-enhancing technology for private machine learning systems. It allows one to measure and bound the risk associated with an individual's participation in a computation. However, it was recently observed that DP learning systems may exacerbate bias and unfairness for different groups of individuals. This paper builds on these important observations and sheds light on the causes of the disparate impacts arising in the problem of differentially private empirical risk minimization. It focuses on the accuracy disparity arising among groups of individuals in two well-studied DP learning methods: output perturbation and differentially private stochastic gradient descent. The paper analyzes which data and model properties are responsible for the disproportionate impacts and why these aspects affect different groups disproportionately, and it proposes guidelines to mitigate these effects. The proposed approach is evaluated on several datasets and settings.
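For readers unfamiliar with the two methods named in the abstract, the following is a minimal sketch of their generic textbook forms, together with a helper that reports accuracy per group. It is not the paper's implementation. All function names (`clip`, `dp_sgd_gradient`, `output_perturbation`, `group_accuracy`) and parameters (`clip_norm`, `noise_multiplier`, `sigma`) are hypothetical, and the noise scale is assumed to have been calibrated to a privacy budget elsewhere.

```python
import math
import random
from typing import Dict, Hashable, List, Sequence


def l2_norm(v: Sequence[float]) -> float:
    """Euclidean norm of a vector."""
    return math.sqrt(sum(x * x for x in v))


def clip(v: Sequence[float], clip_norm: float) -> List[float]:
    """Rescale v so that its L2 norm is at most clip_norm."""
    norm = l2_norm(v)
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [x * scale for x in v]


def dp_sgd_gradient(
    per_example_grads: Sequence[Sequence[float]],
    clip_norm: float,
    noise_multiplier: float,
    rng: random.Random,
) -> List[float]:
    """DP-SGD style gradient: clip each example, sum, add Gaussian noise, average."""
    dim = len(per_example_grads[0])
    total = [0.0] * dim
    for grad in per_example_grads:
        for i, x in enumerate(clip(grad, clip_norm)):
            total[i] += x
    # Assumption: noise standard deviation is noise_multiplier * clip_norm.
    sigma = noise_multiplier * clip_norm
    n = len(per_example_grads)
    return [(t + rng.gauss(0.0, sigma)) / n for t in total]


def output_perturbation(
    theta: Sequence[float], sigma: float, rng: random.Random
) -> List[float]:
    """Add Gaussian noise to already-trained parameters.

    Assumption: sigma was calibrated to the model's sensitivity elsewhere.
    """
    return [t + rng.gauss(0.0, sigma) for t in theta]


def group_accuracy(
    y_true: Sequence[int], y_pred: Sequence[int], groups: Sequence[Hashable]
) -> Dict[Hashable, float]:
    """Accuracy computed separately for each group label."""
    correct: Dict[Hashable, int] = {}
    count: Dict[Hashable, int] = {}
    for y, p, g in zip(y_true, y_pred, groups):
        count[g] = count.get(g, 0) + 1
        correct[g] = correct.get(g, 0) + int(y == p)
    return {g: correct[g] / count[g] for g in count}
```

Calling `group_accuracy` on the predictions of a non-private model and of its private counterpart, then comparing the per-group values, gives one simple way to observe the kind of accuracy disparity the abstract refers to. The clipping step also shows why examples are not treated uniformly: a gradient with norm above `clip_norm` is shrunk, while a smaller one passes through unchanged.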