LGCRMay 7

Trade-off Functions for DP-SGD with Subsampling based on Random Shuffling: Tight Upper and Lower Bounds

arXiv:2605.0625920.8
AI Analysis

This work provides the first tight, interpretable analysis of DP-SGD with random shuffling, addressing a key gap in differential privacy theory for practical deep learning.

The paper provides tight upper and lower bounds on the trade-off function for DP-SGD with random shuffling subsampling, yielding closed-form expressions. For a single epoch, they show that with noise multiplier σ=1 and δ=1/100, roughly 1.14×10^6 rounds and 1.14×10^7 samples suffice to achieve meaningful DP, and they derive asymptotic results for multiple epochs.

We derive a tight analysis of the trade-off function for Differentially Private Stochastic Gradient Descent (DP-SGD) with subsampling based on random shuffling within the $f$-DP framework. Our analysis covers the regime $σ\geq \sqrt{3/\ln M}$, where $σ$ is the noise multiplier and $M$ is the number of rounds within a single epoch. Unlike $f$-DP analyses for Poisson subsampling, which yield non-closed implicit formulas that can be machine computed but are non-transparent, random shuffling admits a tight analysis yielding transparent and interpretable closed-form bounds. Our concrete bounds, derived via the Berry-Esseen theorem, are tight up to constant factors within the proof framework. We demonstrate worked parameter settings for a single epoch ($E=1$) with a corresponding trade-off function $\geq 1-a-δ$, that is, only $δ$ below the ideal random guessing diagonal $1-a$: For $δ= 1/100$ and $σ= 1$, roughly $M \approx 1.14\times 10^6$ rounds and $N \approx 1.14\times 10^7$ training samples suffice to achieve meaningful differential privacy. This is in contrast to recent negative results for the regime $σ\leq 1/\sqrt{2 \ln M}$. Our concrete bounds can be composed over multiple epochs leading to $δ$ having a linear in $E$ dependency, which restricts $E=O(\sqrt{M})$. To go beyond Berry--Esseen, we introduce a new proof technique based on a generalization of the law of large numbers that yields an asymptotic random guessing diagonal-limit result: if $E=c_M^2M$ with $c_M\to 0$, then the $E$-fold composed trade-off function satisfies $f^{\otimes E}(a)\to 1-a$ uniformly in $a\in[0,1]$ with $δ$ having only an $O(\sqrt{E})$ dependency. We compare this asymptotic regime with the corresponding Poisson subsampling asymptotic, and highlight the characterization of explicit convergence rates as an open question.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes