Does Privacy Always Harm Fairness? Data-Dependent Trade-offs via Chernoff Information Neural Estimation
This work helps machine learning practitioners understand the trade-offs between fairness and privacy, though its contribution is incremental because it builds on existing information-theoretic concepts.
The paper uses Chernoff Information to show that the relationship among fairness, privacy, and accuracy in machine learning is data-dependent: on synthetic data, it exhibits three distinct behaviors depending on the data distribution. The paper also proposes a method for estimating Chernoff Information on real datasets.
Fairness and privacy are two vital pillars of trustworthy machine learning. Despite extensive research on these individual topics, the relationship between fairness and privacy has received significantly less attention. In this paper, we utilize the information-theoretic measure Chernoff Information to highlight the data-dependent nature of the relationship among the triad of fairness, privacy, and accuracy. We first define Noisy Chernoff Difference, a tool that allows us to analyze the relationship among the triad simultaneously. We then show that for synthetic data, this value behaves in three distinct ways, depending on the distribution of the data. We highlight the data distributions involved in these cases and explore their fairness and privacy implications. Additionally, we show that Noisy Chernoff Difference acts as a proxy for the steepness of the fairness-accuracy curves. Finally, we propose a method for estimating Chernoff Information on data from unknown distributions and utilize this framework to examine the triad dynamic on real datasets. This work builds towards a unified understanding of the fairness-privacy-accuracy relationship and highlights its data-dependent nature.
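For readers unfamiliar with the underlying quantity, the standard textbook definition of Chernoff Information between two distributions P and Q is C(P, Q) = -min over λ in [0, 1] of log Σ_x P(x)^λ Q(x)^(1-λ). The sketch below evaluates it for discrete distributions with known probabilities by a simple grid search over λ. It is only an illustration of that definition: the function name `chernoff_information` and the `grid` parameter are hypothetical, and it is neither the paper's neural estimator for unknown distributions nor its Noisy Chernoff Difference, neither of which is specified in the abstract.

```python
import math


def chernoff_information(p, q, grid=10001):
    """Chernoff information (in nats) between discrete distributions p and q.

    Illustrative helper, not taken from the paper. Evaluates
    -min_{lam in [0, 1]} log sum_x p(x)^lam * q(x)^(1 - lam)
    on an evenly spaced grid of lam values.
    """
    if len(p) != len(q):
        raise ValueError("p and q must be defined over the same outcomes")
    best = math.inf
    for i in range(grid):
        lam = i / (grid - 1)
        # Only outcomes with mass under both distributions contribute.
        s = sum(
            (pi ** lam) * (qi ** (1.0 - lam))
            for pi, qi in zip(p, q)
            if pi > 0 and qi > 0
        )
        if s <= 0:
            # Disjoint supports: the two distributions are perfectly separable.
            return math.inf
        best = min(best, math.log(s))
    return -best


if __name__ == "__main__":
    p = [0.9, 0.1]
    q = [0.1, 0.9]
    print(chernoff_information(p, q))
```

A larger value means the two distributions are easier to tell apart, and identical distributions give zero. For the mirrored pair in the usage example the minimum sits at λ = 1/2 by symmetry, so the result equals -log(2·sqrt(0.9·0.1)).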