On the Privacy of Selection Mechanisms with Gaussian Noise
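This work addresses privacy concerns in data analysis for applications such as mobility and energy consumption, offering incremental improvements in privacy accounting (tighter bounds for Gaussian-noise selection mechanisms) and in mechanism design (a Gaussian Sparse Vector Technique that requires less hyper-parameter tuning).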
The paper tackles the problem of providing pure differential privacy guarantees for selection mechanisms such as Report Noisy Max and Above Threshold when they are instantiated with Gaussian noise, a setting in which standard analyses yield only approximate guarantees. It shows that, under the assumption that the queries are bounded, tight pure ex-ante DP bounds can be derived for Report Noisy Max and tight pure ex-post DP bounds for Above Threshold. Empirically, these bounds give tighter privacy accounting in the high-privacy, low-data regime, and they lead to a competitive Gaussian Sparse Vector Technique that requires less hyper-parameter tuning.
Report Noisy Max and Above Threshold are two classical differentially private (DP) selection mechanisms. Their output is obtained by adding noise to a sequence of low-sensitivity queries and reporting the identity of the query whose (noisy) answer satisfies a certain condition. Pure DP guarantees for these mechanisms are easy to obtain when Laplace noise is added to the queries. On the other hand, when the mechanisms are instantiated using Gaussian noise, standard analyses only yield approximate DP guarantees, even though the outputs of these mechanisms lie in a discrete space. In this work, we revisit the analysis of Report Noisy Max and Above Threshold with Gaussian noise and show that, under the additional assumption that the underlying queries are bounded, it is possible to provide pure ex-ante DP bounds for Report Noisy Max and pure ex-post DP bounds for Above Threshold. The resulting bounds are tight and depend on closed-form expressions that can be numerically evaluated using standard methods. Empirically, we find that these bounds lead to tighter privacy accounting in the high-privacy, low-data regime. Further, we propose a simple privacy filter for composing pure ex-post DP guarantees, and use it to derive a fully adaptive Gaussian Sparse Vector Technique mechanism. Finally, we provide experiments on mobility and energy consumption datasets demonstrating that our Sparse Vector Technique is practically competitive with previous approaches and requires less hyper-parameter tuning.
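To make the two selection mechanisms concrete, the following is a minimal sketch of their textbook forms with Gaussian noise, assuming NumPy. It is not taken from the paper and does not compute the paper's privacy bounds. The function names, the parameter names (`sigma`, `sigma_threshold`, `sigma_query`), and the choice to perturb the threshold once and each query independently are illustrative assumptions.

```python
import numpy as np


def report_noisy_max(query_answers, sigma, rng):
    """Return the index of the query with the largest noisy answer.

    Assumption: independent N(0, sigma^2) noise is added to every answer.
    """
    answers = np.asarray(query_answers, dtype=float)
    noisy = answers + rng.normal(0.0, sigma, size=answers.shape)
    return int(np.argmax(noisy))


def above_threshold(query_answers, threshold, sigma_threshold, sigma_query, rng):
    """Return the index of the first query whose noisy answer reaches the
    noisy threshold, or None if no query does.

    Assumption: the threshold is perturbed once, and each query answer is
    perturbed with fresh, independent Gaussian noise.
    """
    noisy_threshold = threshold + rng.normal(0.0, sigma_threshold)
    for i, answer in enumerate(query_answers):
        if answer + rng.normal(0.0, sigma_query) >= noisy_threshold:
            return i  # stop at the first query that passes
    return None


if __name__ == "__main__":
    rng = np.random.default_rng(seed=0)
    answers = [0.1, 0.4, 0.35, 0.9]  # hypothetical bounded query answers
    print(report_noisy_max(answers, sigma=0.5, rng=rng))
    print(above_threshold(answers, threshold=0.8, sigma_threshold=0.5,
                          sigma_query=0.5, rng=rng))
```

Both functions return only an index (or `None`), which illustrates the abstract's remark that the output space is discrete even though the added noise is continuous.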