Improving Differentially Private Models with Active Learning
This work addresses the privacy concerns that arise when deploying machine learning models trained on sensitive data, offering an incremental improvement over existing differentially private (DP) training techniques.
The paper tackles the performance degradation of differentially private neural networks by fine-tuning them with active learning on public data, improving on state-of-the-art accuracy for DP models on the MNIST and SVHN datasets while maintaining privacy guarantees.
Broad adoption of machine learning techniques has increased privacy concerns for models trained on sensitive data such as medical records. Existing techniques for training differentially private (DP) models give rigorous privacy guarantees, but applying these techniques to neural networks can severely degrade model performance. This performance reduction is an obstacle to deploying private models in the real world. In this work, we improve the performance of DP models by fine-tuning them through active learning on public data. We introduce two new techniques, DIVERSEPUBLIC and NEARPRIVATE, for performing this fine-tuning in a privacy-aware way. On the MNIST and SVHN datasets, these techniques improve on state-of-the-art accuracy for DP models while retaining privacy guarantees.
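To make the idea of active learning on public data concrete, the following is a minimal sketch of a generic uncertainty-sampling step. It is not DIVERSEPUBLIC or NEARPRIVATE, whose details are not given in this abstract. It assumes that a DP model has already been trained, that its softmax outputs on unlabeled public points are available, and that the selected points would then be labeled and used for fine-tuning. The function names, the example probabilities, and the labeling budget are all hypothetical.

```python
import math


def predictive_entropy(probs):
    """Shannon entropy (in nats) of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_most_uncertain(public_probs, budget):
    """Return indices of the `budget` public points with the highest entropy."""
    scores = [predictive_entropy(p) for p in public_probs]
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:budget]


# Hypothetical softmax outputs of a DP model on four unlabeled public points.
public_probs = [
    [0.98, 0.01, 0.01],  # confident prediction, low entropy
    [0.40, 0.35, 0.25],  # uncertain
    [0.70, 0.20, 0.10],  # moderately confident
    [0.34, 0.33, 0.33],  # close to uniform, highest entropy
]

# Pick the two points the model is least sure about.
chosen = select_most_uncertain(public_probs, budget=2)
print(chosen)
```

Because the selection step only evaluates an already-trained DP model on public inputs, it is a form of post-processing and does not by itself touch the private training data. Whether a full fine-tuning pipeline preserves the original guarantee depends on how the selected points are labeled, which this sketch leaves out.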