Efficient Membership Inference Attacks by Bayesian Neural Network
This work addresses the computational cost of privacy attacks on machine learning models, a practical concern for practitioners, though the contribution appears incremental because it builds on existing MIA methods.
The paper tackles the computational overhead of Membership Inference Attacks (MIAs) by proposing BMIA, which converts a single trained reference model into a Bayesian neural network via Laplace approximation and uses it to estimate conditional score distributions. The authors report efficient and powerful attacks across five datasets.
Membership Inference Attacks (MIAs) aim to estimate whether a specific data point was used in the training of a given model. Previous attacks often rely on multiple reference models to approximate the conditional score distribution, which incurs significant computational overhead. While recent work leverages quantile regression to estimate conditional thresholds, it fails to capture epistemic uncertainty, resulting in bias in low-density regions. In this work, we propose a novel approach, the Bayesian Membership Inference Attack (BMIA), which performs a conditional attack through Bayesian inference. In particular, we transform a trained reference model into a Bayesian neural network by Laplace approximation, enabling direct estimation of the conditional score distribution through probabilistic model parameters. Our method addresses both epistemic and aleatoric uncertainty with only a single reference model, enabling efficient and powerful MIA. Extensive experiments on five datasets demonstrate the effectiveness and efficiency of BMIA.
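To make the idea of a conditional attack with a single Bayesian reference model more concrete, the following is a minimal sketch and not the authors' implementation. It swaps the neural network for a linear score model, for which the Laplace approximation is exact, and assumes that higher scores indicate membership. The function names (`fit_laplace_linear`, `predictive`, `is_member`), the default prior precision, and the significance level `alpha` are all hypothetical choices made for illustration.

```python
import math
import numpy as np


def fit_laplace_linear(X, y, prior_precision=1.0):
    """Fit a linear score model and return its Gaussian (Laplace) posterior.

    Assumption: a linear model stands in for the reference network.
    """
    n, d = X.shape
    # Rough noise-variance estimate from ordinary least-squares residuals.
    w_ols, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ w_ols
    noise_var = max(float(resid @ resid) / max(n - d, 1), 1e-12)
    # Hessian of the negative log posterior. With a Gaussian likelihood and
    # a Gaussian prior it does not depend on w, so the Laplace step is exact.
    hessian = X.T @ X / noise_var + prior_precision * np.eye(d)
    w_map = np.linalg.solve(hessian, X.T @ y / noise_var)
    posterior_cov = np.linalg.inv(hessian)
    return w_map, posterior_cov, noise_var


def predictive(x, w_map, posterior_cov, noise_var):
    """Return (mean, epistemic variance, aleatoric variance) of the score at x."""
    mean = float(x @ w_map)
    epistemic = float(x @ posterior_cov @ x)  # uncertainty about the weights
    return mean, epistemic, noise_var         # noise_var: irreducible spread


def is_member(score, mean, total_var, alpha=0.05):
    """One-sided test: flag membership if the observed score is unusually
    high relative to the predicted non-member score distribution."""
    z = (score - mean) / math.sqrt(total_var)
    p_value = 0.5 * math.erfc(z / math.sqrt(2.0))  # upper tail of N(0, 1)
    return p_value < alpha
```

In this sketch the decision threshold widens automatically for inputs far from the reference data, because the epistemic term grows there. That is the behavior the abstract says a plain quantile-regression threshold lacks in low-density regions.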