Correlated Noise Provably Beats Independent Noise for Differentially Private Learning
This work addresses the challenge of balancing privacy and utility in machine learning for applications requiring data protection, offering an efficient method with provable improvements over standard approaches.
The paper studies how to improve the utility of differentially private learning by adding correlated rather than independent noise. It shows that correlated noise provably outperforms DP-SGD, with the size of the improvement depending on problem parameters such as the effective dimension and condition number, and it validates these gains with experiments on private deep learning.
Differentially private learning algorithms inject noise into the learning process. While the most common private learning algorithm, DP-SGD, adds independent Gaussian noise in each iteration, recent work on matrix factorization mechanisms has shown empirically that introducing correlations in the noise can greatly improve utility. We characterize the asymptotic learning utility for any choice of the correlation function, giving precise analytical bounds for linear regression and bounds expressed as the solution to a convex program for general convex functions. We show, using these bounds, how correlated noise provably improves upon vanilla DP-SGD as a function of problem parameters such as the effective dimension and condition number. Moreover, our analytical expression for the near-optimal correlation function circumvents the cubic complexity of the semi-definite program used to optimize the noise correlation matrix in previous work. We validate our theory with experiments on private deep learning. Our work matches or outperforms prior work while being efficient both in terms of compute and memory.
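To make the contrast between independent and correlated noise concrete, the following is a minimal sketch and not the paper's mechanism. It assumes the noise injected at iteration t is a weighted sum of i.i.d. Gaussian seeds, w[t] = sum_k beta[k] * z[t - k], where the function name `correlated_noise` and the weights `beta` are hypothetical and chosen only for illustration. With `beta = [1.0]` the noise is independent across iterations, as in DP-SGD. The sketch ignores gradient clipping and the calibration of the noise scale that a privacy guarantee would require.

```python
import numpy as np


def correlated_noise(z, beta):
    """Return w with w[t] = sum_k beta[k] * z[t - k].

    z    : array of shape (T,) or (T, d) holding i.i.d. Gaussian seeds.
    beta : 1-D sequence of correlation weights (hypothetical values).
    Terms with t - k < 0 are dropped.
    """
    z = np.asarray(z, dtype=float)
    beta = np.asarray(beta, dtype=float)
    T = len(z)
    w = np.zeros_like(z)
    for k, b in enumerate(beta[:T]):
        # Seed z[t - k] contributes to step t with weight beta[k].
        w[k:] += b * z[: T - k]
    return w


rng = np.random.default_rng(0)
T, d = 1000, 5
z = rng.normal(size=(T, d))

# Independent noise: each step receives a fresh Gaussian draw.
independent = correlated_noise(z, [1.0])

# Correlated noise with illustrative negative weights (an assumption,
# not the near-optimal correlation function from the paper).
correlated = correlated_noise(z, [1.0, -0.5, -0.125])

# Noise accumulated over all iterations, averaged over coordinates.
print("accumulated independent:", np.linalg.norm(independent.sum(axis=0)) / np.sqrt(d))
print("accumulated correlated: ", np.linalg.norm(correlated.sum(axis=0)) / np.sqrt(d))
```

The negative weights make later noise partially cancel earlier noise when the per-step noise accumulates across iterations, which is the usual intuition for why correlation can help. A fair comparison would also rescale the seeds so that both variants satisfy the same privacy guarantee, which this sketch does not attempt.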