LG | AI | CR | Nov 20, 2023

Can we infer the presence of Differential Privacy in Deep Learning models' weights? Towards more secure Deep Learning

arXiv:2311.11717v1 | h-index: 18 | Has Code
Originality: Incremental advance
AI Analysis

The paper addresses a security gap for users and regulators who need to certify DP in shared models. The advance is incremental because it builds on existing DP-SGD methods.

The paper tackles the problem of verifying whether a Deep Learning model has been trained with Differential Privacy (DP). Current practice relies on trusting the model provider, which is problematic under data privacy regulations. The authors propose inferring the presence of DP from the model weights with a meta-classifier, which removes the need for a trusted provider and lays a foundation for this research area.
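To make the meta-classifier idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes NumPy is available. The feature set in `weight_features` and the `NearestCentroidMeta` class are hypothetical names chosen for illustration; the paper's actual features and classifier may differ.

```python
import numpy as np


def weight_features(weights):
    """Summary statistics of a flattened weight array (hypothetical feature set)."""
    w = np.asarray(weights, dtype=float).ravel()
    return np.array([w.mean(), w.std(), np.abs(w).mean(), np.abs(w).max()])


class NearestCentroidMeta:
    """Toy meta-classifier: assign the label of the closest class centroid."""

    def fit(self, weight_sets, labels):
        # One feature row per model in the training collection.
        feats = np.stack([weight_features(w) for w in weight_sets])
        labels = np.asarray(labels)
        self.classes_ = np.unique(labels)
        # Mean feature vector of each class (e.g. 0 = no DP, 1 = DP).
        self.centroids_ = np.stack(
            [feats[labels == c].mean(axis=0) for c in self.classes_]
        )
        return self

    def predict(self, weight_sets):
        feats = np.stack([weight_features(w) for w in weight_sets])
        # Euclidean distance from every model to every class centroid.
        dists = np.linalg.norm(
            feats[:, None, :] - self.centroids_[None, :, :], axis=2
        )
        return self.classes_[dists.argmin(axis=1)]
```

In use, `fit` would receive a list of weight arrays with labels marking which models were trained with DP, and `predict` would be called on the weights of an unseen model. Whether such simple statistics separate real DP and non-DP models is an open question that this sketch does not answer.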

Differential Privacy (DP) is a key property to protect data and models from integrity attacks. In the Deep Learning (DL) field, it is commonly implemented through Differentially Private Stochastic Gradient Descent (DP-SGD). However, when a model is shared or released, there is no way to check whether it is differentially private; that is, one has to trust the model provider. This situation poses a problem when data privacy is mandatory, especially under current data regulations, as the presence of DP cannot be certified consistently by any third party. Thus, we face the challenge of determining whether a DL model has been trained with DP, in line with the title question: Can we infer the presence of Differential Privacy in Deep Learning models' weights? Since DP-SGD significantly changes the training process of a DL model, we hypothesize that DP leaves an imprint in the weights of a DL model, which can be used to predict whether a model has been trained with DP regardless of its architecture and training dataset. In this paper, we propose to use this imprint to infer the presence of DP training in a DL model. To substantiate our hypothesis, we developed an experimental methodology based on two datasets of DL model weights, each containing models trained with and without DP, and a meta-classifier that infers from a model's weights whether DP was used in its training. We accomplish both the removal of the requirement of a trusted model provider and a strong foundation for this line of research. Thus, our contribution is an additional layer of security on top of the strict privacy requirements of DP training in DL models, towards more secure DL models.
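For readers unfamiliar with why DP-SGD changes the training process, here is a minimal sketch of a single DP-SGD update in its standard form: clip each per-sample gradient, sum, add Gaussian noise, then average. It assumes NumPy and is not taken from the paper's code. The function name `dp_sgd_step` and its parameter names are illustrative assumptions.

```python
import numpy as np


def dp_sgd_step(weights, per_sample_grads, lr, clip_norm, noise_multiplier, rng):
    """One DP-SGD update on a flat weight vector.

    per_sample_grads has shape (batch_size, num_weights), one row per example.
    """
    # L2 norm of each example's gradient.
    norms = np.linalg.norm(per_sample_grads, axis=1, keepdims=True)
    # Shrink gradients whose norm exceeds clip_norm; leave the others unchanged.
    scale = np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    clipped = per_sample_grads * scale
    # Gaussian noise with standard deviation noise_multiplier * clip_norm,
    # added to the sum of the clipped gradients.
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=weights.shape)
    noisy_mean = (clipped.sum(axis=0) + noise) / per_sample_grads.shape[0]
    return weights - lr * noisy_mean
```

Clipping and noise injection at every step are the two ingredients that plain SGD lacks, which is what makes the hypothesis of a detectable imprint in the final weights plausible.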

Code Implementations: 1 repo
