Communication Complexity in Locally Private Distribution Estimation and Heavy Hitters
This work addresses privacy and communication efficiency in data analysis, providing optimal algorithms for distribution and heavy hitter estimation, which is incremental in combining these constraints.
The paper tackles the problem of distribution estimation and heavy hitter estimation under local differential privacy and communication constraints, proposing a sample-optimal scheme for distribution estimation with one-bit communication and showing that Hadamard Response is optimal for heavy hitters, while proving a lower bound on communication bits for heavy hitter estimation without public randomness.
We consider the problems of distribution estimation and heavy hitter (frequency) estimation under privacy and communication constraints. While these constraints have been studied separately, optimal schemes for one are sub-optimal for the other. We propose a sample-optimal $\varepsilon$-locally differentially private (LDP) scheme for distribution estimation, where each user communicates only one bit, and requires no public randomness. We show that Hadamard Response, a recently proposed scheme for $\varepsilon$-LDP distribution estimation is also utility-optimal for heavy hitter estimation. Finally, we show that unlike distribution estimation, without public randomness where only one bit suffices, any heavy hitter estimation algorithm that communicates $o(\min \{\log n, \log k\})$ bits from each user cannot be optimal.