Improving Deep Learning with Differential Privacy using Gradient Encoding and Denoising
This work addresses privacy leakage in deep learning for applications requiring data confidentiality, offering an incremental improvement over existing DP methods.
The paper tackles the problem of training deep learning models with differential privacy (DP) guarantees while minimizing accuracy loss, achieving a privacy bound of ε=3.2 compared to ε=6 for DPSGD at the same accuracy of 96.1% on MNIST.
Deep learning models leak significant amounts of information about their training datasets. Previous work has investigated training models with differential privacy (DP) guarantees by adding DP noise to the gradients. However, such solutions (specifically, DPSGD) result in large degradations in the accuracy of the trained models. In this paper, we aim to train deep learning models with DP guarantees while preserving model accuracy much better than previous works. Our key technique is to encode gradients, mapping them to a smaller vector space, which enables us to obtain DP guarantees for different noise distributions. This allows us to investigate and choose noise distributions that best preserve model accuracy for a target privacy budget. We also take advantage of the post-processing property of differential privacy by introducing the idea of denoising, which further improves the utility of the trained models without degrading their DP guarantees. We show that our mechanism outperforms the state-of-the-art DPSGD; for instance, for the same model accuracy of $96.1\%$ on MNIST, our technique results in a privacy bound of $\epsilon=3.2$ compared to $\epsilon=6$ for DPSGD, which is a significant improvement.
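To make the pipeline described above concrete, the following is a minimal sketch of the three stages (encode, add noise, denoise) for a single gradient vector. It is an illustration under stated assumptions, not the mechanism evaluated in the paper: the encoder is assumed to be a linear projection with orthonormal rows, the noise is assumed to be Gaussian (the abstract only says that several distributions are considered), and the denoiser is assumed to be a simple shrinkage rule. The names `clip_l2`, `encode`, `privatize`, `denoise`, and the parameter `signal_var` are hypothetical. No privacy accounting is performed here.

```python
import numpy as np


def clip_l2(g, c):
    """Rescale g so that its L2 norm is at most c (bounds the sensitivity)."""
    norm = np.linalg.norm(g)
    return g * min(1.0, c / (norm + 1e-12))


def encode(g, proj):
    """Assumed encoder: a linear map into a lower-dimensional space.

    If proj has orthonormal rows, the encoded vector's L2 norm cannot
    exceed that of g, so the clipping bound still holds after encoding.
    """
    return proj @ g


def privatize(z, sigma, rng):
    """Add noise to the encoded gradient (Gaussian is an assumption here)."""
    return z + rng.normal(0.0, sigma, size=z.shape)


def denoise(z_noisy, signal_var, sigma):
    """Assumed denoiser: shrink the noisy vector toward zero.

    The shrinkage factor depends only on sigma and on signal_var, which is
    treated as a public constant. The function never reads the private
    gradient, so it is pure post-processing of the noisy output.
    """
    return z_noisy * signal_var / (signal_var + sigma ** 2)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    g = rng.normal(size=100)                       # stand-in for a gradient
    q, _ = np.linalg.qr(rng.normal(size=(100, 10)))
    proj = q.T                                     # 10 orthonormal rows
    z = encode(clip_l2(g, 1.0), proj)
    z_hat = denoise(privatize(z, 0.5, rng), signal_var=0.1, sigma=0.5)
    print(z_hat.shape)
```

The denoising step relies on the post-processing property: applying any function that does not touch the private data to the output of a DP mechanism leaves the guarantee unchanged. This is why `denoise` takes only the noisy vector and fixed constants as inputs.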