LG AI CR | Jul 7, 2023

Scalable Membership Inference Attacks via Quantile Regression

Martin Bertran, Shuai Tang, Michael Kearns, Jamie Morgenstern, Aaron Roth, Zhiwei Steven Wu

Amazon

arXiv:2307.03694v1 | 27.186 citations | h-index: 60

Originality: Highly original

AI Analysis

This paper addresses the problem of scalable privacy assessment for machine learning models, offering a more efficient attack for researchers and practitioners who need to measure data leakage.

The paper tackles the computational expense of membership inference attacks by introducing a new attack based on quantile regression, which is competitive with state-of-the-art methods while requiring significantly less compute and no knowledge of the model architecture.

Membership inference attacks are designed to determine, using black box access to trained models, whether a particular example was used in training or not. Membership inference can be formalized as a hypothesis testing problem. The most effective existing attacks estimate the distribution of some test statistic (usually the model's confidence on the true label) on points that were (and were not) used in training by training many shadow models, i.e. models of the same architecture as the model being attacked, trained on a random subsample of data. While effective, these attacks are extremely computationally expensive, especially when the model under attack is large. We introduce a new class of attacks based on performing quantile regression on the distribution of confidence scores induced by the model under attack on points that are not used in training. We show that our method is competitive with state-of-the-art shadow model attacks, while requiring substantially less compute because our attack requires training only a single model. Moreover, unlike shadow model attacks, our proposed attack does not require any knowledge of the architecture of the model under attack and is therefore truly "black-box". We show the efficacy of this approach in an extensive series of experiments on various datasets and model architectures.
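
The idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes synthetic one-dimensional features and synthetic confidence scores in place of a real target model, and it swaps in a linear quantile regressor fit by subgradient descent on the pinball loss. The function names, the 0.95 quantile level, and the +1.5 score shift for training members are all hypothetical choices made for illustration. The regressor is fit only on points known not to be in the training set; a point is then flagged as a member when its score exceeds the predicted per-example quantile, which fixes the false positive rate at roughly 1 - q.

```python
import numpy as np

def fit_linear_quantile(X, y, q, lr=0.1, epochs=3000):
    """Fit y ~ X @ w + b at quantile level q by subgradient descent on the pinball loss."""
    n, d = X.shape
    w = np.zeros(d)
    b = float(np.quantile(y, q))  # start from the marginal quantile
    for _ in range(epochs):
        residual = y - (X @ w + b)
        # Subgradient of the pinball loss with respect to the prediction.
        grad_pred = np.where(residual > 0, -q, 1.0 - q)
        w -= lr * (X.T @ grad_pred) / n
        b -= lr * grad_pred.mean()
    return w, b

def flag_members(X, scores, w, b):
    """Guess 'member' when the score exceeds the predicted non-member quantile."""
    return scores > X @ w + b

rng = np.random.default_rng(0)

def sample(n, member):
    # Synthetic stand-in (assumption): the score depends on the example,
    # and training members receive a higher score on average.
    X = rng.normal(size=(n, 1))
    scores = 2.0 * X[:, 0] + rng.normal(size=n) + (1.5 if member else 0.0)
    return X, scores

q = 0.95  # target false positive rate of about 1 - q
X_pub, s_pub = sample(5000, member=False)   # known non-members used to fit the attack
w, b = fit_linear_quantile(X_pub, s_pub, q)

X_out, s_out = sample(5000, member=False)   # fresh non-members
X_in, s_in = sample(5000, member=True)      # training members
fpr = float(np.mean(flag_members(X_out, s_out, w, b)))
tpr = float(np.mean(flag_members(X_in, s_in, w, b)))
print(f"false positive rate: {fpr:.3f}, true positive rate: {tpr:.3f}")
```

Only one regressor is trained here, and it needs the target model's scores but nothing about its architecture, which mirrors the compute and black-box claims made in the abstract.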
