LG · Feb 24, 2022

Bounding Membership Inference

Anvith Thudi, Ilia Shumailov, Franziska Boenisch, Nicolas Papernot

arXiv:2202.12232v413.020 citations

Originality: Incremental advance

AI Analysis

This work addresses the accuracy-privacy trade-off for entities with large datasets, offering a practical improvement over existing DP methods.

The paper tackles the lack of theoretical understanding of why differential privacy (DP) reduces the success of membership inference (MI) attacks. It provides a tighter bound on attack precision for DP-trained models and introduces a subsampling scheme that lowers this bound, which permits looser DP guarantees and can improve the accuracy-privacy trade-off on benchmarks such as MNIST and CIFAR10.

Differential Privacy (DP) is the de facto standard for reasoning about the privacy guarantees of a training algorithm. Despite the empirical observation that DP reduces the vulnerability of models to existing membership inference (MI) attacks, a theoretical underpinning as to why this is the case is largely missing in the literature. In practice, this means that models need to be trained with DP guarantees that greatly decrease their accuracy. In this paper, we provide a tighter bound on the positive accuracy (i.e., attack precision) of any MI adversary when a training algorithm provides $(\varepsilon, \delta)$-DP. Our bound informs the design of a novel privacy amplification scheme: an effective training set is sub-sampled from a larger set prior to the beginning of training. We find this greatly reduces the bound on MI positive accuracy. As a result, our scheme allows the use of looser DP guarantees to limit the success of any MI adversary; this ensures that the model's accuracy is less impacted by the privacy guarantee. While this clearly benefits entities working with far more data than they need to train on, it can also improve the accuracy-privacy trade-off on benchmarks studied in the academic literature. Consequently, we also find that subsampling decreases the effectiveness of a state-of-the-art MI attack (LiRA) much more effectively than training with stronger DP guarantees on MNIST and CIFAR10. We conclude by discussing implications of our MI bound on the field of machine unlearning.
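The intuition behind the subsampling scheme can be illustrated with a minimal sketch. It is not the paper's stated bound or code: it assumes pure $\varepsilon$-DP ($\delta = 0$), that each point of the larger pool is kept independently with probability `keep_prob`, and it applies Bayes' rule with the DP likelihood ratio $e^{\varepsilon}$ to cap the adversary's posterior belief that a point was used in training. The function names `subsample_indices` and `posterior_upper_bound` are hypothetical.

```python
import math
import random


def subsample_indices(pool_size, keep_prob, seed=0):
    """Pick the effective training set before training starts.

    Assumption: each of the pool_size points is kept independently
    with probability keep_prob (Bernoulli sampling).
    """
    rng = random.Random(seed)
    return [i for i in range(pool_size) if rng.random() < keep_prob]


def posterior_upper_bound(epsilon, keep_prob):
    """Illustrative cap on P(point was trained on | released model).

    Assumption: pure epsilon-DP, so the likelihood ratio between
    "point in" and "point out" is at most exp(epsilon). Bayes' rule
    with prior keep_prob then gives
        1 / (1 + exp(-epsilon) * (1 - keep_prob) / keep_prob).
    """
    if not 0.0 < keep_prob <= 1.0:
        raise ValueError("keep_prob must be in (0, 1]")
    prior_odds_against = (1.0 - keep_prob) / keep_prob
    return 1.0 / (1.0 + math.exp(-epsilon) * prior_odds_against)


if __name__ == "__main__":
    effective_set = subsample_indices(pool_size=10_000, keep_prob=0.1, seed=1)
    print("effective training set size:", len(effective_set))
    for p in (0.5, 0.1, 0.01):
        print("keep_prob", p, "bound", posterior_upper_bound(2.0, p))
```

Under these assumptions, lowering `keep_prob` at a fixed `epsilon` lowers the cap, which matches the abstract's claim that sub-sampling from a larger set allows a looser DP guarantee for the same limit on MI positive accuracy.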
