LG CR Oct 12, 2021

Not all noise is accounted equally: How differentially private learning benefits from large sampling rates

Friedrich Dörmann, Osvald Frisk, Lars Nørvang Andersen, Christian Fischer Pedersen

arXiv:2110.06255v111.331 citations Has Code

Originality: Incremental advance

AI Analysis

This work addresses privacy concerns in machine learning for applications handling sensitive data, offering an incremental improvement in the privacy/utility tradeoff.

The study examines differentially private learning and shows that inherent sampling noise and additive Gaussian noise affect utility equivalently but are not accounted for equally in the privacy budget. It proposes a training paradigm that shifts the noise proportions to improve the privacy/utility tradeoff, and reports an improvement on the state of the art for private end-to-end CNNs.

Learning often involves sensitive data, and as such, privacy-preserving extensions to Stochastic Gradient Descent (SGD) and other machine learning algorithms have been developed using the definitions of Differential Privacy (DP). In differentially private SGD, the gradients computed at each training iteration are subject to two different types of noise: first, inherent sampling noise arising from the use of minibatches; second, additive Gaussian noise from the underlying mechanisms that introduce privacy. In this study, we show that these two types of noise are equivalent in their effect on the utility of private neural networks; however, they are not accounted for equally in the privacy budget. Given this observation, we propose a training paradigm that shifts the proportions of noise towards less inherent and more additive noise, such that more of the overall noise can be accounted for in the privacy budget. With this paradigm, we are able to improve on the state of the art in the privacy/utility tradeoff of private end-to-end CNNs.
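To make the two noise sources in the abstract concrete, here is a minimal NumPy sketch of one gradient computation in the commonly used clip-then-add-noise form of differentially private SGD. It is an illustration under stated assumptions, not the authors' implementation: the function names `dp_sgd_gradient` and `additive_noise_std` and the parameters `clip_norm` and `noise_multiplier` are hypothetical names chosen for this example, and privacy accounting is left out entirely.

```python
import numpy as np


def dp_sgd_gradient(per_example_grads, clip_norm, noise_multiplier, rng):
    """Return a noisy average gradient for one minibatch (illustrative sketch).

    per_example_grads: array of shape (batch_size, dim), one gradient per example.
    clip_norm, noise_multiplier: hypothetical parameter names for this example.
    """
    batch_size, dim = per_example_grads.shape

    # Clip each per-example gradient so its L2 norm is at most clip_norm.
    norms = np.linalg.norm(per_example_grads, axis=1, keepdims=True)
    scale = np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    clipped = per_example_grads * scale

    # Additive Gaussian noise, scaled relative to the clipping norm.
    # This is the noise that the privacy budget accounts for.
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=dim)

    # Which examples ended up in the minibatch is the second,
    # inherent source of noise (sampling noise).
    return (clipped.sum(axis=0) + noise) / batch_size


def additive_noise_std(batch_size, clip_norm, noise_multiplier):
    """Per-coordinate standard deviation of the additive noise after averaging."""
    return noise_multiplier * clip_norm / batch_size


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    grads = rng.normal(size=(256, 10))  # toy per-example gradients
    noisy_grad = dp_sgd_gradient(grads, clip_norm=1.0, noise_multiplier=1.1, rng=rng)
    print(noisy_grad.shape)
    print(additive_noise_std(256, clip_norm=1.0, noise_multiplier=1.1))
```

In this sketch, a larger minibatch averages over more examples, so the inherent sampling noise shrinks, while `noise_multiplier` controls how much additive noise is injected. That is the direction of the shift the abstract describes: less inherent noise, more additive noise.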

