Anonymized Histograms in Intermediate Privacy Models
This work provides a simple and effective method for private data analysis in intermediate privacy models, with applications to estimating symmetric distribution properties like entropy and support size.
The authors address the problem of privately computing anonymized histograms in the shuffle and pan-private models of differential privacy, achieving a nearly optimal error bound of $\tilde{O}_\varepsilon(\sqrt{n})$ by post-processing a discrete Laplace-noised histogram.
We study the problem of privately computing the anonymized histogram (a.k.a. unattributed histogram), which is defined as the histogram without item labels. Previous works have provided algorithms with $\ell_1$- and $\ell_2^2$-errors of $O_\varepsilon(\sqrt{n})$ in the central model of differential privacy (DP). In this work, we provide an algorithm with a nearly matching error guarantee of $\tilde{O}_\varepsilon(\sqrt{n})$ in the shuffle DP and pan-private models. Our algorithm is very simple: it just post-processes the discrete Laplace-noised histogram! Using this algorithm as a subroutine, we show applications in privately estimating symmetric properties of distributions such as entropy, support coverage, and support size.
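To make the objects in the abstract concrete, here is a minimal sketch of a discrete Laplace-noised histogram followed by a naive post-processing step. Several things are assumptions for illustration and not taken from the paper: the function names, the add/remove-one-user neighboring notion (so each count has sensitivity 1), and the post-processing itself, which simply clips negative counts and sorts. The paper's actual post-processing is not specified in the abstract and may well differ. The noise is also sampled in one place here, whereas in the shuffle and pan-private models it would be generated according to the respective protocol.

```python
import math
import random


def discrete_laplace(epsilon, rng):
    """Sample Z with P(Z = k) proportional to exp(-epsilon * |k|).

    Uses the fact that the difference of two i.i.d. geometric variables
    (number of failures, success probability 1 - exp(-epsilon)) follows
    the discrete Laplace distribution.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    log_alpha = -epsilon  # alpha = exp(-epsilon)

    def geometric():
        u = 1.0 - rng.random()  # uniform on (0, 1], avoids log(0)
        return int(math.floor(math.log(u) / log_alpha))

    return geometric() - geometric()


def noisy_histogram(counts, epsilon, rng):
    """Add independent discrete Laplace noise to every count.

    Assumption: `counts` covers the whole item domain, including items
    with count zero, and neighboring datasets differ by one user.
    """
    return [c + discrete_laplace(epsilon, rng) for c in counts]


def naive_anonymized_histogram(noisy_counts):
    """Hypothetical post-processing: drop labels by clipping and sorting.

    This is only a baseline for illustration, not the paper's method.
    """
    return sorted((max(c, 0) for c in noisy_counts), reverse=True)


if __name__ == "__main__":
    rng = random.Random(0)
    true_counts = [5, 0, 3, 0, 1]  # one count per item in the domain
    noisy = noisy_histogram(true_counts, epsilon=1.0, rng=rng)
    print(naive_anonymized_histogram(noisy))
```

Because the second step touches only the noised counts, it is pure post-processing and so cannot weaken whatever privacy guarantee the noised histogram already has. How much error it incurs is a separate question that this sketch does not address.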