LG, AI, ML · Sep 17, 2020

An Extension of Fano's Inequality for Characterizing Model Susceptibility to Membership Inference Attacks

arXiv:2009.08097v1 · 8 citations
AI Analysis

This work addresses privacy risks in machine learning by providing a theoretical and empirical measure of model susceptibility to membership inference attacks. It is incremental in that it builds on the existing Fano's inequality.

The authors address membership inference attacks on deep neural networks, which can leak private training data. They extend Fano's inequality to theoretically bound the attack success probability in terms of the mutual information between a network's inputs and its activations, and they report high empirical correlations between mutual information and model susceptibility (e.g., 0.996 on SVHN).

Deep neural networks have been shown to be vulnerable to membership inference attacks wherein the attacker aims to detect whether specific input data were used to train the model. These attacks can potentially leak private or proprietary data. We present a new extension of Fano's inequality and employ it to theoretically establish that the probability of success for a membership inference attack on a deep neural network can be bounded using the mutual information between its inputs and its activations. This enables the use of mutual information to measure the susceptibility of a DNN model to membership inference attacks. In our empirical evaluation, we show that the correlation between the mutual information and the susceptibility of the DNN model to membership inference attacks is 0.966, 0.996, and 0.955 for CIFAR-10, SVHN and GTSRB models, respectively.
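To make the link between mutual information and error probability concrete, the following is a minimal sketch of the classical Fano's inequality on a toy discrete channel. It is not the paper's extension, which is not reproduced here. It uses the standard weak form P_e >= (H(X|Y) - 1) / log2(|X|), with H(X|Y) = H(X) - I(X;Y). All function names (`symmetric_joint`, `mutual_information`, `fano_error_lower_bound`, `map_error`) and the toy channel itself are illustrative assumptions.

```python
import math


def entropy(p):
    """Shannon entropy, in bits, of a probability vector."""
    return -sum(q * math.log2(q) for q in p if q > 0)


def mutual_information(joint):
    """I(X;Y) in bits for a joint probability table joint[x][y]."""
    px = [sum(row) for row in joint]
    py = [sum(col) for col in zip(*joint)]
    mi = 0.0
    for x, row in enumerate(joint):
        for y, pxy in enumerate(row):
            if pxy > 0:
                mi += pxy * math.log2(pxy / (px[x] * py[y]))
    return mi


def fano_error_lower_bound(joint):
    """Weak classical Fano bound: P_e >= (H(X|Y) - 1) / log2(|X|), clipped at 0."""
    px = [sum(row) for row in joint]
    h_x_given_y = entropy(px) - mutual_information(joint)
    return max(0.0, (h_x_given_y - 1.0) / math.log2(len(joint)))


def map_error(joint):
    """Error probability of the best possible (MAP) guess of X given Y."""
    return 1.0 - sum(max(col) for col in zip(*joint))


def symmetric_joint(n, eps):
    """Toy channel (assumption): X uniform on n symbols, Y = X with prob. 1 - eps,
    otherwise Y is uniform over the other n - 1 symbols."""
    return [
        [(1 - eps) / n if x == y else eps / (n * (n - 1)) for y in range(n)]
        for x in range(n)
    ]


if __name__ == "__main__":
    for eps in (0.1, 0.5):
        joint = symmetric_joint(8, eps)
        print(
            f"eps={eps}: I(X;Y)={mutual_information(joint):.3f} bits, "
            f"Fano lower bound={fano_error_lower_bound(joint):.3f}, "
            f"MAP error={map_error(joint):.3f}"
        )
```

In this toy setting, the noisier channel has lower mutual information and a higher Fano lower bound on the guessing error, so the guesser's best achievable success probability is capped more tightly. This is the same qualitative relationship the abstract describes between mutual information and attack success. Note that for a binary variable the weak form above is vacuous, since the bound is never positive.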

