CRMay 15

Rethinking the Security of DP-SGD: A Corrected Analysis of Differentially Private Machine Learning

Wenhao Wang, Shujie Cui, Hui Cui, Xingliang Yuan

arXiv:2605.156489.2

Predicted impact top 40% in CR · last 90 daysOriginality Incremental advance

AI Analysis

For practitioners and researchers using DP-SGD, this work reveals that commonly used implementations may provide weaker privacy than claimed, necessitating corrected analyses.

The paper identifies a mismatch between the formal privacy analysis of DP-SGD (based on the Subsampled Gaussian Mechanism) and common implementations that use averaging steps, showing that actual privacy guarantees can be weaker. Empirical audits of four implementations, including Opacus, reveal leakage beyond reported guarantees, and a corrected guarantee is derived for the latest Opacus version.

Differentially Private Stochastic Gradient Descent (DP-SGD) is widely used to protect training data in machine learning. Its privacy guarantee is commonly analyzed through a security game in which an adversary infers whether a target record is included in the training dataset from the mechanism output. The resulting privacy leakage is characterized by a privacy curve, which reports the false negative rate as a function of the false positive rate. We identify a mismatch between this formal analysis and common DP-SGD implementations. Existing analyses often model DP-SGD and its variants as the Subsampled Gaussian Mechanism (SGM), where Gaussian noise is added to the sum of clipped gradients computed from a Poisson-sampled batch. In practice, however, many implementations apply an additional normalization step: the noisy gradient sum is divided either by the expected batch size or by the sampled batch size. These mechanisms are therefore better formalized as the Expected-Averaged SGM (EASGM) or the Batch-Averaged SGM (ASGM), respectively. We re-analyze the privacy guarantees of DP-SGD under the EASGM and ASGM formulations. Our theoretical results show that these guarantees can be weaker than the standard SGM-based guarantee, implying that the true privacy leakage may exceed the reported guarantee in some regimes. We further audit four state-of-the-art DP-SGD implementations, including Meta's Opacus library, and observe empirical leakage beyond the SGM-based guarantees. Finally, we audit Opacus versions v0.9.0 to v1.5.4 and derive a corrected privacy guarantee for the latest implementation.

View on arXiv PDF

Similar