LG CR ML | Oct 3, 2020

Differentially Private Representation for NLP: Formal Guarantee and An Empirical Study on Privacy and Fairness

arXiv:2010.01285v153.21009 citations | Has Code

Originality: Incremental advance

AI Analysis

This work addresses privacy risks in NLP for users of text-based AI systems, though it is incremental because it builds on existing differential privacy methods.

The paper tackles the problem of hidden representations in deep NLP models encoding private information by proposing Differentially Private Neural Representation (DPNR), which reduces privacy leakage by up to 30% on benchmark datasets while maintaining task performance within 5% degradation.

It has been demonstrated that the hidden representations learned by a deep model can encode private information about the input and can hence be exploited to recover such information with reasonable accuracy. To address this issue, we propose a novel approach called Differentially Private Neural Representation (DPNR) to preserve the privacy of the representation extracted from text. DPNR utilises Differential Privacy (DP) to provide a formal privacy guarantee. We also show that masking words via dropout can further enhance privacy. To maintain the utility of the learned representation, we integrate the DP-noisy representation into a robust training process to derive a robust target model, which also helps model fairness over various demographic variables. Experimental results on benchmark datasets under various parameter settings demonstrate that DPNR largely reduces privacy leakage without significantly sacrificing the main task performance.
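
The two ingredients named in the abstract, word masking via dropout and DP noise on the extracted representation, can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the Laplace mechanism, a representation clipped to [0, 1] per coordinate, and a worst-case L1 sensitivity equal to the vector length; the paper's own normalisation and noise calibration may differ. The helper names (`mask_words`, `clip_unit`, `privatize`) and the `<unk>` mask token are hypothetical.

```python
import random


def mask_words(tokens, drop_rate, rng, mask_token="<unk>"):
    """Replace each token by mask_token independently with probability drop_rate."""
    return [mask_token if rng.random() < drop_rate else t for t in tokens]


def clip_unit(vec):
    """Clip every coordinate to [0, 1] so the L1 sensitivity is at most len(vec)."""
    return [min(1.0, max(0.0, x)) for x in vec]


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials with mean scale."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def privatize(vec, epsilon, rng):
    """Laplace mechanism on a clipped representation (assumed calibration, see lead-in)."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    clipped = clip_unit(vec)
    # Assumption: any two clipped vectors differ by at most len(vec) in L1 norm.
    sensitivity = float(len(clipped))
    scale = sensitivity / epsilon
    return [x + laplace_noise(scale, rng) for x in clipped]


if __name__ == "__main__":
    rng = random.Random(0)
    tokens = "the patient lives in Sydney".split()
    print(mask_words(tokens, drop_rate=0.3, rng=rng))
    # A stand-in for an encoder output; a real system would compute this from the masked text.
    representation = [0.2, 1.7, -0.3, 0.9]
    print(privatize(representation, epsilon=1.0, rng=rng))
```

A smaller `epsilon` gives a larger noise scale and therefore stronger privacy at the cost of utility, which is the trade-off the abstract's robust training step is meant to soften. In this sketch the masking step carries no formal guarantee by itself; the abstract only reports that it further enhances privacy.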
