CR AI LG ML · Feb 4, 2021

ML-Doctor: Holistic Risk Assessment of Inference Attacks Against Machine Learning Models

Yugeng Liu, Rui Wen, Xinlei He, Ahmed Salem, Zhikun Zhang, Michael Backes, Emiliano De Cristofaro, Mario Fritz, Yang Zhang

arXiv:2102.02551v2 · 38.0 · 167 citations · Has Code

Originality: Incremental advance

AI Analysis

This work provides a comprehensive risk assessment for ML model owners and researchers concerned about the privacy and security implications of deploying machine learning models, offering insights into the interplay and effectiveness of various inference attacks and defenses.

This paper provides a holistic risk assessment of four inference attacks (membership inference, model inversion, attribute inference, and model stealing) against machine learning models. The study, conducted on five model architectures and four image datasets, found that the complexity of the training dataset strongly influences attack performance, and that the effectiveness of model stealing is negatively correlated with that of membership inference. It also showed that defenses such as DP-SGD and Knowledge Distillation mitigate only some of these attacks.

Inference attacks against Machine Learning (ML) models allow adversaries to learn sensitive information about training data, model parameters, etc. While researchers have studied several kinds of attacks in depth, they have done so in isolation. As a result, we lack a comprehensive picture of the risks caused by the attacks, e.g., the different scenarios they can be applied to, the common factors that influence their performance, the relationship among them, or the effectiveness of possible defenses. In this paper, we fill this gap by presenting a first-of-its-kind holistic risk assessment of different inference attacks against machine learning models. We concentrate on four attacks -- namely, membership inference, model inversion, attribute inference, and model stealing -- and establish a threat model taxonomy. Our extensive experimental evaluation, run on five model architectures and four image datasets, shows that the complexity of the training dataset plays an important role with respect to the attacks' performance, while the effectiveness of model stealing and that of membership inference are negatively correlated. We also show that defenses like DP-SGD and Knowledge Distillation can only mitigate some of the inference attacks. Our analysis relies on a modular, reusable software tool, ML-Doctor, which enables ML model owners to assess the risks of deploying their models, and equally serves as a benchmark tool for researchers and practitioners.
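
To make the first of the four attacks concrete, here is a minimal sketch of membership inference using a simple confidence-threshold heuristic: a sample is guessed to be a training member when the target model's top class probability is unusually high. This heuristic is an assumption chosen for illustration; it is not necessarily the attack evaluated in the paper, and none of the names below (`infer_membership`, `attack_accuracy`, `synthetic_posterior`) come from ML-Doctor. The posteriors are synthetic and separable by construction, so the numbers say nothing about real models.

```python
import random


def infer_membership(posterior, threshold=0.9):
    """Guess 'member' when the model's top class probability meets the threshold."""
    return max(posterior) >= threshold


def attack_accuracy(member_posteriors, nonmember_posteriors, threshold=0.9):
    """Fraction of samples whose membership status the heuristic guesses correctly."""
    correct = sum(infer_membership(p, threshold) for p in member_posteriors)
    correct += sum(not infer_membership(p, threshold) for p in nonmember_posteriors)
    return correct / (len(member_posteriors) + len(nonmember_posteriors))


def synthetic_posterior(rng, top_low, top_high, num_classes=4):
    """Hypothetical stand-in for a model output: one dominant class, the rest uniform."""
    top = rng.uniform(top_low, top_high)
    rest = (1.0 - top) / (num_classes - 1)
    return [top] + [rest] * (num_classes - 1)


rng = random.Random(0)
# Assumption: the toy "model" is more confident on training members
# than on unseen samples, which is the gap this heuristic exploits.
members = [synthetic_posterior(rng, 0.92, 1.0) for _ in range(100)]
nonmembers = [synthetic_posterior(rng, 0.40, 0.88) for _ in range(100)]

accuracy = attack_accuracy(members, nonmembers, threshold=0.9)
print(f"attack accuracy on synthetic data: {accuracy:.2f}")
```

On a real model the two confidence distributions overlap, so the threshold would have to be tuned, and accuracy would depend on factors such as how much the model overfits its training data.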
