CR · LG · Feb 27, 2020

Membership Inference Attacks and Defenses in Classification Models

arXiv:2002.12062v3 · 53 citations
AI Analysis

This paper addresses the privacy risks that machine learning models pose to users and organizations, and presents an incremental advance in defense mechanisms.

The paper tackles membership inference attacks on classifiers by linking a model's vulnerability to its generalization gap. It proposes a defense that lowers training accuracy to match validation accuracy using a set regularizer and, when combined with mix-up training, reports a significant improvement over state-of-the-art defenses with minimal impact on testing accuracy.

We study the membership inference (MI) attack against classifiers, where the attacker's goal is to determine whether a data instance was used for training the classifier. Through systematic cataloging of existing MI attacks and extensive experimental evaluations of them, we find that a model's vulnerability to MI attacks is tightly related to the generalization gap, that is, the difference between training accuracy and test accuracy. We then propose a defense against MI attacks that aims to close the gap by intentionally reducing the training accuracy. More specifically, the training process attempts to match the training and validation accuracies by means of a new set regularizer that uses the Maximum Mean Discrepancy between the softmax output empirical distributions of the training and validation sets. Our experimental results show that combining this approach with another simple defense (mix-up training) significantly improves on the state-of-the-art defense against MI attacks, with minimal impact on testing accuracy.
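To make the set regularizer more concrete, here is a minimal sketch of a Maximum Mean Discrepancy penalty between two batches of softmax outputs. It is an illustration under stated assumptions, not the authors' implementation: the Gaussian kernel, the `bandwidth` value, the biased estimator, and the function names `softmax`, `gaussian_kernel`, and `mmd_squared` are all choices made here for the example and are not taken from the abstract.

```python
import numpy as np


def softmax(logits):
    """Row-wise softmax with the usual max-subtraction for numerical stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


def gaussian_kernel(a, b, bandwidth=1.0):
    """Gaussian (RBF) kernel matrix between rows of a and rows of b.

    Assumption: the kernel family and bandwidth are illustrative choices.
    """
    sq_dist = (
        (a ** 2).sum(axis=1)[:, None]
        + (b ** 2).sum(axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    sq_dist = np.maximum(sq_dist, 0.0)  # clip tiny negatives from rounding
    return np.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mmd_squared(p_train, p_val, bandwidth=1.0):
    """Biased estimate of squared MMD between two sets of softmax vectors."""
    k_tt = gaussian_kernel(p_train, p_train, bandwidth).mean()
    k_vv = gaussian_kernel(p_val, p_val, bandwidth).mean()
    k_tv = gaussian_kernel(p_train, p_val, bandwidth).mean()
    return k_tt + k_vv - 2.0 * k_tv


# Synthetic logits only: the "training" batch is made more confident than the
# "validation" batch to mimic an overfit model. No real data is used here.
rng = np.random.default_rng(0)
p_train = softmax(5.0 * rng.normal(size=(64, 10)))
p_val = softmax(1.0 * rng.normal(size=(64, 10)))

penalty_same = mmd_squared(p_train, p_train)  # identical sets
penalty_diff = mmd_squared(p_train, p_val)    # mismatched confidence profiles
print(penalty_same, penalty_diff)
```

In a training loop, a penalty of this kind would typically be added to the classification loss with a weighting coefficient (a hypothetical hyperparameter here), so that the optimizer is discouraged from producing training-set outputs that are distributed differently from validation-set outputs.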

Code Implementations: 1 repo