CL | AI | LG | Oct 24, 2024

Does Differential Privacy Impact Bias in Pretrained NLP Models?

arXiv:2410.18749v1 | 2 citations | h-index: 2
Originality: Incremental advance
AI Analysis

This paper addresses fairness concerns in privacy-preserving NLP for underrepresented groups, though the advance is incremental because it builds on prior findings about DP and bias.

The study investigated how differential privacy (DP) affects bias in pre-trained large language models (LLMs) during fine-tuning, finding that DP training increases bias against protected groups, as measured by AUC-based metrics, with the impact varying based on privacy levels and dataset distribution.

Differential privacy (DP) is applied when fine-tuning pre-trained large language models (LLMs) to limit leakage of training examples. While most DP research has focused on improving a model's privacy-utility tradeoff, some studies find that DP can be unfair to or biased against underrepresented groups. In this work, we show the impact of DP on bias in LLMs through empirical analysis. Differentially private training can increase the model bias against protected groups w.r.t. AUC-based bias metrics. DP makes it more difficult for the model to differentiate between the positive and negative examples from the protected groups and other groups in the rest of the population. Our results also show that the impact of DP on bias is affected not only by the privacy protection level but also by the underlying distribution of the dataset.
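
The abstract refers to "AUC-based bias metrics" without naming them. A minimal sketch of what such metrics can look like follows, assuming the commonly used trio of Subgroup AUC, BPSN AUC (background positive, subgroup negative), and BNSP AUC (background negative, subgroup positive); this choice is an assumption and may not match the paper's exact setup. The function names `auc` and `bias_aucs` and the toy data are hypothetical and for illustration only.

```python
import numpy as np

def auc(labels, scores):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    pos, neg = scores[labels], scores[~labels]
    if len(pos) == 0 or len(neg) == 0:
        return float("nan")  # AUC is undefined without both classes
    diff = pos[:, None] - neg[None, :]  # all positive/negative score pairs
    return float(((diff > 0).sum() + 0.5 * (diff == 0).sum()) / diff.size)

def bias_aucs(labels, scores, in_group):
    """Assumed metric trio: Subgroup, BPSN, and BNSP AUC for one protected group."""
    y = np.asarray(labels, dtype=bool)
    s = np.asarray(scores, dtype=float)
    g = np.asarray(in_group, dtype=bool)
    bpsn = (g & ~y) | (~g & y)   # group negatives vs. background positives
    bnsp = (g & y) | (~g & ~y)   # group positives vs. background negatives
    return {
        "subgroup": auc(y[g], s[g]),
        "bpsn": auc(y[bpsn], s[bpsn]),
        "bnsp": auc(y[bnsp], s[bnsp]),
    }

# Toy data (made up): the model scores every protected-group example higher
# than every background example, regardless of the true label.
labels   = [1, 0, 1, 0, 1, 0, 1, 0]
in_group = [1, 1, 1, 1, 0, 0, 0, 0]
scores   = [0.9, 0.6, 0.8, 0.7, 0.5, 0.1, 0.4, 0.2]
result = bias_aucs(labels, scores, in_group)
print(result)
```

In this toy case the model separates positives from negatives perfectly within the group, yet the BPSN AUC collapses because group negatives are scored above background positives. A drop of this kind in any of the three values, relative to a non-private baseline, is the sort of signal the abstract describes as increased bias.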

Code Implementations: 1 repo