LG · ML · Jun 15, 2019

Membership Privacy for Machine Learning Models Through Knowledge Transfer

arXiv:1906.06589v321.727 citations

Originality: Incremental advance

AI Analysis

This work addresses privacy concerns for users of machine learning models by offering a more effective defense against membership inference attacks. The advance is incremental because it builds on existing knowledge distillation techniques.

The paper tackles membership inference attacks (MIAs) on machine learning models by proposing a defense called distillation for membership privacy (DMP), which uses knowledge distillation to improve the privacy-utility tradeoff. At a similar MIA risk (53.7%), a DMP-trained DenseNet reaches 65.3% accuracy on CIFAR100 versus 33.6% with adversarial regularization, which the authors describe as a ~100% accuracy improvement.

Large capacity machine learning (ML) models are prone to membership inference attacks (MIAs), which aim to infer whether the target sample is a member of the target model's training dataset. The serious privacy concerns due to the membership inference have motivated multiple defenses against MIAs, e.g., differential privacy and adversarial regularization. Unfortunately, these defenses produce ML models with unacceptably low classification performances. Our work proposes a new defense, called distillation for membership privacy (DMP), against MIAs that preserves the utility of the resulting models significantly better than prior defenses. DMP leverages knowledge distillation to train ML models with membership privacy. We provide a novel criterion to tune the data used for knowledge transfer in order to amplify the membership privacy of DMP. Our extensive evaluation shows that DMP provides significantly better tradeoffs between membership privacy and classification accuracies compared to state-of-the-art MIA defenses. For instance, DMP achieves ~100% accuracy improvement over adversarial regularization for DenseNet trained on CIFAR100, for similar membership privacy (measured using MIA risk): when the MIA risk is 53.7%, adversarially regularized DenseNet is 33.6% accurate, while DMP-trained DenseNet is 65.3% accurate.
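The abstract names knowledge distillation as the mechanism and quotes two accuracies for comparison, so a small sketch may help make both concrete. The Python below is not the authors' implementation: it shows a generic soft-label distillation loss, and the helper names (`softmax`, `distillation_loss`, `relative_improvement`), the temperature value, and the toy logits are all assumptions chosen for illustration. The abstract's criterion for selecting the transfer data is not specified there, so it is not modeled here.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Row-wise softmax; a higher temperature gives softer probabilities."""
    z = np.asarray(logits, dtype=float) / temperature
    z = z - z.max(axis=1, keepdims=True)  # stabilise the exponentials
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, temperature=1.0):
    """Mean cross-entropy between teacher soft labels and student predictions.

    Generic distillation objective (assumed form, not taken from the paper).
    """
    soft_labels = softmax(teacher_logits, temperature)
    student_probs = softmax(student_logits, temperature)
    per_sample = -(soft_labels * np.log(student_probs + 1e-12)).sum(axis=1)
    return float(per_sample.mean())

def relative_improvement(new_acc, old_acc):
    """Relative accuracy gain in percent."""
    return 100.0 * (new_acc - old_acc) / old_acc

# Toy logits (hypothetical): 2 transfer samples, 3 classes.
teacher_logits = np.array([[4.0, 1.0, 0.0], [0.5, 2.5, 0.2]])
student_logits = np.array([[2.0, 1.0, 0.5], [0.0, 1.0, 0.0]])

loss = distillation_loss(student_logits, teacher_logits, temperature=2.0)
gain = relative_improvement(65.3, 33.6)  # accuracies quoted in the abstract

print(f"distillation loss: {loss:.4f}")
print(f"relative accuracy gain: {gain:.1f}%")
```

With the accuracies quoted in the abstract, the relative gain works out to about 94%, which is consistent with the "~100%" figure the authors report.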
