An Empirical Study on the Intrinsic Privacy of SGD
This work addresses the utility-privacy trade-off in privacy-preserving machine learning. It provides empirical evidence that the inherent randomness of SGD contributes to privacy, which could reduce the additional noise needed to reach a given privacy guarantee.
The paper investigates whether the inherent randomness of stochastic gradient descent (SGD) contributes to privacy, potentially reducing the additional noise required for differential privacy. In a large-scale empirical study of over 120,000 models across four datasets, it finds that the random seed has a larger impact on model weights than any individual training example. For convex objectives, treating SGD as a Gaussian mechanism gives an intrinsic, data-dependent $ε$ as low as 6.3, dropping to 1.9 with empirical estimates. For non-convex objectives, hiding the random seed from a membership inference adversary yields a statistically significant reduction in attack performance.
Introducing noise in the training of machine learning systems is a powerful way to protect individual privacy via differential privacy guarantees, but comes at a cost to utility. This work looks at whether the inherent randomness of stochastic gradient descent (SGD) could contribute to privacy, effectively reducing the amount of \emph{additional} noise required to achieve a given privacy guarantee. We conduct a large-scale empirical study to examine this question. Training a grid of over 120,000 models across four datasets (tabular and images) on convex and non-convex objectives, we demonstrate that the random seed has a larger impact on model weights than any individual training example. We test the distribution over weights induced by the seed, finding that the simple convex case can be modelled with a multivariate Gaussian posterior, while neural networks exhibit multi-modal and non-Gaussian weight distributions. By casting convex SGD as a Gaussian mechanism, we then estimate an `intrinsic' data-dependent $ε_i(\mathcal{D})$, finding values as low as 6.3, dropping to 1.9 using empirical estimates. We use a membership inference attack to estimate $ε$ for non-convex SGD and demonstrate that hiding the random seed from the adversary results in a statistically significant reduction in attack performance, corresponding to a reduction in the effective $ε$. These results provide empirical evidence that SGD exhibits appreciable variability relative to its dataset sensitivity, and this `intrinsic noise' has the potential to be leveraged to improve the utility of privacy-preserving machine learning.
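To make the convex-case idea concrete, the following is a minimal sketch of how one might compare seed-induced variability with dataset sensitivity and convert the two into an $ε$ estimate. It is an illustration under stated assumptions, not the paper's actual procedure. The toy dataset, the helpers `sgd_logreg`, `unit` and `gaussian_mechanism_epsilon`, the choice of $δ = 1/N^2$, and the use of the smallest per-coordinate standard deviation as $σ$ are all hypothetical. The conversion inverts the classical Gaussian mechanism calibration $σ = \sqrt{2 \ln(1.25/δ)}\, Δ / ε$, which is only proven for $ε < 1$, so larger values should be read as indicative.

```python
import math
import random
import statistics


def sgd_logreg(X, y, seed, epochs=5, lr=0.1):
    """Batch-size-1 SGD on the logistic loss; the seed only controls example order."""
    rng = random.Random(seed)
    w = [0.0] * len(X[0])  # fixed initialisation: ordering is the sole source of randomness
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            z = sum(wj * xj for wj, xj in zip(w, X[i]))
            grad_scale = 1.0 / (1.0 + math.exp(-z)) - y[i]
            w = [wj - lr * grad_scale * xj for wj, xj in zip(w, X[i])]
    return w


def gaussian_mechanism_epsilon(sensitivity, sigma, delta):
    """Invert sigma = sqrt(2 ln(1.25/delta)) * sensitivity / epsilon (classical bound)."""
    return math.sqrt(2.0 * math.log(1.25 / delta)) * sensitivity / sigma


def unit(v):
    """Scale a vector to unit L2 norm."""
    norm = math.sqrt(sum(a * a for a in v))
    return [a / norm for a in v]


# Toy data (hypothetical; not one of the paper's datasets).
data_rng = random.Random(0)
true_w = [data_rng.gauss(0, 1) for _ in range(5)]
X = [unit([data_rng.gauss(0, 1) for _ in range(5)]) for _ in range(200)]
y = [1.0 if sum(a * b for a, b in zip(x, true_w)) > 0 else 0.0 for x in X]

# Variability due to the seed: same dataset, 20 different example orderings.
runs = [sgd_logreg(X, y, seed) for seed in range(20)]
# Assumption of this sketch: take the smallest per-coordinate standard deviation.
sigma_hat = min(statistics.stdev(coord) for coord in zip(*runs))

# Sensitivity to data: same seed (0), one example replaced by a fixed new point.
w_ref = runs[0]  # the run trained with seed 0 on the unmodified dataset
x_new, y_new = unit([1.0] * 5), 0.0  # hypothetical replacement example
dists = []
for i in range(20):  # only a subset of neighbouring datasets, to keep the sketch fast
    X2, y2 = list(X), list(y)
    X2[i], y2[i] = x_new, y_new
    dists.append(math.dist(sgd_logreg(X2, y2, seed=0), w_ref))
sensitivity_hat = max(dists)

delta = 1.0 / len(X) ** 2  # assumed choice of delta
eps_hat = gaussian_mechanism_epsilon(sensitivity_hat, sigma_hat, delta)
print(f"sigma={sigma_hat:.4g} sensitivity={sensitivity_hat:.4g} epsilon={eps_hat:.4g}")
```

Because the sensitivity here is a maximum over only 20 neighbouring datasets and the standard deviation is estimated from 20 seeds, the resulting number is an empirical, data-dependent estimate rather than a formal guarantee, which matches the spirit of the "empirical estimates" mentioned in the abstract.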