DS, CR · Nov 5, 2021

Tight Bounds for Differentially Private Anonymized Histograms

arXiv:2111.03257v1 · 3.36 citations

Originality: Synthesis-oriented

AI Analysis

This work tightens the theoretical error bounds for differentially private anonymized histograms, an incremental but important contribution to privacy-preserving data analysis.

The paper tackles the problem of computing anonymized histograms under differential privacy, providing an algorithm with expected ℓ₁-error of O(√n / e^ε) for low privacy (ε ≥ 1) and a lower bound of Ω(√(n log(1/ε) / ε)) for high privacy (ε < 1).
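
To get a feel for how the two rates scale, the following minimal sketch evaluates the expressions inside the O(·) and Ω(·) with all hidden constants dropped. The function names are hypothetical and chosen for illustration only; the returned values show growth rates, not actual error guarantees.

```python
import math


def upper_bound_low_privacy(n: int, eps: float) -> float:
    """Evaluate sqrt(n) / e^eps, the low-privacy rate (eps >= 1).

    Assumption: the constant hidden by O(.) is set to 1.
    """
    if eps < 1:
        raise ValueError("the low-privacy regime requires eps >= 1")
    return math.sqrt(n) / math.exp(eps)


def lower_bound_high_privacy(n: int, eps: float) -> float:
    """Evaluate sqrt(n * log(1/eps) / eps), the high-privacy rate (0 < eps < 1).

    Assumption: the constant hidden by Omega(.) is set to 1.
    """
    if not 0 < eps < 1:
        raise ValueError("the high-privacy regime requires 0 < eps < 1")
    return math.sqrt(n * math.log(1 / eps) / eps)


if __name__ == "__main__":
    n = 10_000
    for eps in (1.0, 2.0, 4.0):
        print(f"eps={eps}: sqrt(n)/e^eps = {upper_bound_low_privacy(n, eps):.2f}")
    for eps in (0.9, 0.5, 0.1):
        print(f"eps={eps}: sqrt(n log(1/eps)/eps) = {lower_bound_high_privacy(n, eps):.2f}")
```

In the low-privacy regime the rate shrinks exponentially as ε grows, while in the high-privacy regime it grows as ε approaches 0.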

In this note, we consider the problem of differentially privately (DP) computing an anonymized histogram, which is defined as the multiset of counts of the input dataset (without bucket labels). In the low-privacy regime $ε\geq 1$, we give an $ε$-DP algorithm with an expected $\ell_1$-error bound of $O(\sqrt{n} / e^ε)$. In the high-privacy regime $ε< 1$, we give an $Ω(\sqrt{n \log(1/ε) / ε})$ lower bound on the expected $\ell_1$ error. In both cases, our bounds asymptotically match the previously known lower/upper bounds due to [Suresh, NeurIPS 2019].
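
The definition above can be made concrete with a short sketch that computes the (non-private) anonymized histogram of a dataset and an ℓ₁ distance between two such histograms. The function names are hypothetical, and the distance shown here (sort both count lists, pad the shorter with zeros, sum absolute differences) is an assumed reading of the ℓ₁ error, not a definition taken from the abstract. This is not the paper's DP algorithm; no noise is added.

```python
from collections import Counter
from typing import Hashable, Iterable, List


def anonymized_histogram(data: Iterable[Hashable]) -> List[int]:
    """Return the multiset of counts, with bucket labels discarded.

    The multiset is represented as a list sorted in descending order.
    """
    return sorted(Counter(data).values(), reverse=True)


def l1_distance(h1: List[int], h2: List[int]) -> int:
    """l1 distance between two anonymized histograms.

    Assumption: histograms are compared as sorted count vectors,
    with the shorter one padded with zeros.
    """
    size = max(len(h1), len(h2))
    a = sorted(h1, reverse=True) + [0] * (size - len(h1))
    b = sorted(h2, reverse=True) + [0] * (size - len(h2))
    return sum(abs(x - y) for x, y in zip(a, b))


if __name__ == "__main__":
    data = ["a", "b", "a", "c", "a", "b"]
    relabeled = ["x", "x", "x", "y", "y", "z"]
    # Relabeling the buckets leaves the anonymized histogram unchanged.
    print(anonymized_histogram(data))       # [3, 2, 1]
    print(anonymized_histogram(relabeled))  # [3, 2, 1]
    print(l1_distance([3, 2, 1], [2, 2, 1, 1]))  # 2
```

Because labels are discarded, any two datasets that differ only by a renaming of buckets map to the same output, which is what distinguishes this problem from releasing a labeled histogram.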
