CL AI LG · Oct 24, 2024

Does Differential Privacy Impact Bias in Pretrained NLP Models?

Md. Khairul Islam, Andrew Wang, Tianhao Wang, Yangfeng Ji, Judy Fox, Jieyu Zhao

arXiv:2410.18749v12.72 citations · h-index: 2 · Has Code

Originality: Incremental advance

AI Analysis

This work addresses fairness concerns for underrepresented groups in privacy-preserving NLP, though the advance is incremental because it builds on prior findings about DP and bias.

The study investigated how differential privacy (DP) affects bias in pre-trained large language models (LLMs) during fine-tuning, finding that DP training increases bias against protected groups, as measured by AUC-based metrics, with the impact varying based on privacy levels and dataset distribution.

Differential privacy (DP) is applied when fine-tuning pre-trained large language models (LLMs) to limit leakage of training examples. While most DP research has focused on improving a model's privacy-utility tradeoff, some find that DP can be unfair to or biased against underrepresented groups. In this work, we show the impact of DP on bias in LLMs through empirical analysis. Differentially private training can increase the model bias against protected groups with respect to AUC-based bias metrics. DP makes it more difficult for the model to differentiate between the positive and negative examples from the protected groups and other groups in the rest of the population. Our results also show that the impact of DP on bias is affected not only by the privacy protection level but also by the underlying distribution of the dataset.
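The abstract does not spell out which AUC-based bias metrics are used. As a rough illustration, the sketch below assumes the commonly used family of subgroup AUC, BPSN AUC (background positive, subgroup negative) and BNSP AUC (background negative, subgroup positive), which matches the description of separating positives and negatives across the protected group and the rest of the population. The function names `auc` and `bias_aucs`, the `in_group` flag, and the toy scores are hypothetical and are not taken from the paper or its code.

```python
from typing import Dict, Sequence


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """ROC AUC as the chance that a random positive outscores a random negative.

    Ties count as half a win. Quadratic in the number of examples, which is
    fine for a small illustration.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bias_aucs(
    scores: Sequence[float], labels: Sequence[int], in_group: Sequence[bool]
) -> Dict[str, float]:
    """Compute three AUC-based bias metrics for one protected group (assumed metric family)."""
    rows = list(zip(scores, labels, in_group))

    def subset_auc(keep) -> float:
        kept = [(s, y) for s, y, g in rows if keep(y, g)]
        return auc([s for s, _ in kept], [y for _, y in kept])

    return {
        # Only examples from the protected group.
        "subgroup": subset_auc(lambda y, g: g),
        # Positives from the rest of the population vs. negatives from the group.
        "bpsn": subset_auc(lambda y, g: (not g and y == 1) or (g and y == 0)),
        # Negatives from the rest of the population vs. positives from the group.
        "bnsp": subset_auc(lambda y, g: (not g and y == 0) or (g and y == 1)),
    }


# Toy data: the first four examples belong to the protected group.
scores = [0.9, 0.7, 0.6, 0.2, 0.8, 0.5, 0.4, 0.1]
labels = [1, 0, 1, 0, 1, 1, 0, 0]
in_group = [True, True, True, True, False, False, False, False]

demo = bias_aucs(scores, labels, in_group)
print(demo)
```

Under this assumed metric family, a lower value for a group after DP fine-tuning than after non-private fine-tuning would indicate the kind of increased bias the abstract describes. Comparing the three values also hints at the direction of the error: a low BPSN AUC suggests the group's negatives are scored too high, while a low BNSP AUC suggests the group's positives are scored too low.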

