LG CR ML

Jun 7, 2024

Auditing Differential Privacy Guarantees Using Density Estimation

arXiv:2406.04827v3

10.49 citations

h-index: 23

Originality: Incremental advance

AI Analysis

This work addresses the challenge of verifying privacy guarantees in machine learning models, particularly for subsampled Gaussian mechanisms. The advance is incremental but important for ensuring robust privacy compliance.

The paper audits differential privacy guarantees without requiring prior knowledge of the mechanism's parameters. It uses histogram-based density estimation to compute lower bounds on the hockey-stick divergence between the output distributions of two neighbouring datasets, and it reports improvements over existing methods such as f-DP auditing.
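For reference, the distance the abstract names is the hockey-stick divergence. The block below gives its standard textbook definition, its link to $(\varepsilon,\delta)$-DP, and the reason a histogram yields a lower bound. The symbols ($P$, $Q$, $p$, $q$, $\alpha$, $B_i$, $\mathcal{M}$) are assumptions chosen for illustration and are not necessarily the paper's notation.

```latex
% Hockey-stick divergence of order \alpha \ge 0 between distributions
% P and Q with densities p and q:
\[
  H_{\alpha}(P \,\|\, Q) \;=\; \int \max\{\, p(x) - \alpha\, q(x),\; 0 \,\}\,\mathrm{d}x .
\]
% A mechanism \mathcal{M} is (\varepsilon,\delta)-DP if and only if, for all
% neighbouring datasets D and D',
\[
  H_{e^{\varepsilon}}\bigl(\mathcal{M}(D) \,\|\, \mathcal{M}(D')\bigr) \;\le\; \delta .
\]
% Binning is a post-processing step, so for any partition B_1, \dots, B_k of
% the output space the discretized divergence cannot exceed the true one:
\[
  \sum_{i=1}^{k} \max\{\, P(B_i) - \alpha\, Q(B_i),\; 0 \,\}
  \;\le\; H_{\alpha}(P \,\|\, Q) .
\]
```

The last inequality holds for the exact bin probabilities $P(B_i)$ and $Q(B_i)$. When they are estimated from samples, the estimation error has to be accounted for separately.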

We present a novel method for accurately auditing the differential privacy (DP) guarantees of DP mechanisms. In particular, our solution is applicable to auditing DP guarantees of machine learning (ML) models. Previous auditing methods tightly capture the privacy guarantees of DP-SGD trained models in the white-box setting where the auditor has access to all intermediate models; however, the success of these methods depends on a priori information about the parametric form of the noise and the subsampling ratio used for sampling the gradients. We present a method that does not require such information and is agnostic to the randomization used for the underlying mechanism. Similarly to several previous DP auditing methods, we assume that the auditor has access to a set of independent observations from two one-dimensional distributions corresponding to outputs from two neighbouring datasets. Furthermore, our solution is based on a simple histogram-based density estimation technique to find lower bounds for the statistical distance between these distributions when measured using the hockey-stick divergence. We show that our approach also naturally generalizes the previously considered class of threshold membership inference auditing methods. We improve upon accurate auditing methods such as $f$-DP auditing. Moreover, we address an open problem on how to accurately audit the subsampled Gaussian mechanism without any knowledge of the parameters of the underlying mechanism.
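The histogram idea in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it is a plain plug-in estimate that bins both one-dimensional sample sets on a shared grid and sums the positive part of the per-bin difference. The function name `hockey_stick_histogram`, the equal-width grid, and the default `n_bins=50` are assumptions made for illustration, and the sketch omits the confidence-interval handling a rigorous audit would need.

```python
import numpy as np


def hockey_stick_histogram(x_p, x_q, eps, n_bins=50):
    """Plug-in histogram estimate of H_{e^eps}(P || Q) from 1-D samples.

    Hypothetical helper for illustration only; it does not correct for
    sampling error in the estimated bin probabilities.
    """
    x_p = np.asarray(x_p, dtype=float)
    x_q = np.asarray(x_q, dtype=float)

    # Shared equal-width bin edges covering both sample sets.
    lo = min(x_p.min(), x_q.min())
    hi = max(x_p.max(), x_q.max())
    edges = np.linspace(lo, hi, n_bins + 1)

    # Empirical bin probabilities (counts divided by sample size).
    p_hat = np.histogram(x_p, bins=edges)[0] / x_p.size
    q_hat = np.histogram(x_q, bins=edges)[0] / x_q.size

    # Discretized hockey-stick divergence: sum of [p_i - e^eps * q_i]_+.
    return float(np.sum(np.maximum(p_hat - np.exp(eps) * q_hat, 0.0)))


if __name__ == "__main__":
    # Toy setting (an assumption): Gaussian mechanism with sensitivity 1
    # and noise scale 1, so outputs are N(1, 1) versus N(0, 1).
    rng = np.random.default_rng(0)
    x_p = rng.normal(1.0, 1.0, size=200_000)
    x_q = rng.normal(0.0, 1.0, size=200_000)
    for eps in (0.0, 0.5, 1.0):
        print(eps, hockey_stick_histogram(x_p, x_q, eps))
```

Each returned value is an empirical estimate of $\delta$ at the given $\varepsilon$. In the toy setting it can be compared against the known closed-form $\delta(\varepsilon)$ of the Gaussian mechanism, although the estimator itself uses no knowledge of the noise distribution.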
