Privacy Amplification for Matrix Mechanisms
This work addresses a key limitation for practitioners using state-of-the-art differential privacy algorithms in machine learning, enabling better privacy-utility trade-offs, though it is incremental in extending existing amplification concepts to new mechanisms.
The paper tackles the problem of extending privacy amplification via sampling to matrix mechanisms used in DP-FTRL, proposing the MMCC algorithm that provides nearly tight differential privacy guarantees and leads to significant improvements in privacy-utility trade-offs on standard benchmarks.
Privacy amplification exploits randomness in data selection to provide tighter differential privacy (DP) guarantees. This analysis is key to DP-SGD's success in machine learning, but, is not readily applicable to the newer state-of-the-art algorithms. This is because these algorithms, known as DP-FTRL, use the matrix mechanism to add correlated noise instead of independent noise as in DP-SGD. In this paper, we propose "MMCC", the first algorithm to analyze privacy amplification via sampling for any generic matrix mechanism. MMCC is nearly tight in that it approaches a lower bound as $ε\to0$. To analyze correlated outputs in MMCC, we prove that they can be analyzed as if they were independent, by conditioning them on prior outputs. Our "conditional composition theorem" has broad utility: we use it to show that the noise added to binary-tree-DP-FTRL can asymptotically match the noise added to DP-SGD with amplification. Our amplification algorithm also has practical empirical utility: we show it leads to significant improvement in the privacy-utility trade-offs for DP-FTRL algorithms on standard benchmarks.