Estimating Numerical Distributions under Local Differential Privacy
This work addresses privacy-preserving data collection by improving the accuracy of distribution estimation under local differential privacy (LDP), though it is incremental in that it builds on existing LDP protocols for categorical domains.
The paper tackles the problem of estimating numerical distributions under local differential privacy by introducing the square wave (SW) mechanism and an Expectation Maximization with Smoothing (EMS) algorithm, which consistently outperform existing methods in utility metrics.
When collecting information, local differential privacy (LDP) relieves the concern of privacy leakage from the users' perspective, as each user's private information is randomized before being sent to the aggregator. We study the problem of recovering the distribution over a numerical domain while satisfying LDP. While one can discretize a numerical domain and then apply the protocols developed for categorical domains, we show that taking advantage of the numerical nature of the domain results in a better trade-off between privacy and utility. We introduce a new reporting mechanism, called the square wave (SW) mechanism, which exploits the numerical nature of the domain in reporting. We also develop an Expectation Maximization with Smoothing (EMS) algorithm, which is applied to aggregated histograms from the SW mechanism to estimate the original distributions. Extensive experiments demonstrate that our proposed approach, SW with EMS, consistently outperforms other methods in a variety of utility metrics.
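The abstract does not spell out the mechanics, so the following Python sketch is only an illustration of the two-stage idea (randomized reporting, then EM with smoothing on the report histogram), not the authors' implementation. It assumes inputs normalized to [0, 1], a report domain of [-b, 1 + b], a report density that is e^epsilon times higher within distance b of the true value than elsewhere, a fixed iteration count instead of a convergence test, and a (1/4, 1/2, 1/4) smoothing kernel. The names `sw_report`, `transition_matrix`, `ems`, and the parameter `b` are hypothetical and chosen for this example.

```python
import math
import random


def sw_report(v, epsilon, b, rng=random):
    """Randomize v in [0, 1] into a report in [-b, 1 + b] (assumed SW shape)."""
    if not 0.0 <= v <= 1.0:
        raise ValueError("v must lie in [0, 1]")
    e_eps = math.exp(epsilon)
    q = 1.0 / (2.0 * b * e_eps + 1.0)  # density far from v
    p = e_eps * q                      # density within b of v; p / q = e^epsilon
    if rng.random() < 2.0 * b * p:
        return rng.uniform(v - b, v + b)
    # Far region is [-b, v - b) plus (v + b, 1 + b]; its total length is 1.
    u = rng.random()
    if u < v:                # left piece has length v
        return -b + u
    return b + u             # right piece: v + b + (u - v)


def transition_matrix(d_in, d_out, epsilon, b):
    """m[j][i] ~ Pr[report in output bin j | value at centre of input bin i]."""
    e_eps = math.exp(epsilon)
    q = 1.0 / (2.0 * b * e_eps + 1.0)
    p = e_eps * q
    width = (1.0 + 2.0 * b) / d_out
    m = [[0.0] * d_in for _ in range(d_out)]
    for i in range(d_in):
        v = (i + 0.5) / d_in
        for j in range(d_out):
            lo = -b + j * width
            hi = lo + width
            # Length of the output bin that overlaps the high-density window.
            near = max(0.0, min(hi, v + b) - max(lo, v - b))
            m[j][i] = p * near + q * (width - near)
    return m


def ems(reports, epsilon, b, d=32, iters=100):
    """Estimate a d-bin histogram over [0, 1] from randomized reports."""
    m = transition_matrix(d, d, epsilon, b)
    width = (1.0 + 2.0 * b) / d
    counts = [0] * d
    for r in reports:
        counts[min(d - 1, max(0, int((r + b) / width)))] += 1
    x = [1.0 / d] * d
    for _ in range(iters):
        # EM update: reweight each input bin by how well it explains the counts.
        denom = [sum(m[j][i] * x[i] for i in range(d)) for j in range(d)]
        x = [x[i] * sum(counts[j] * m[j][i] / denom[j] for j in range(d))
             for i in range(d)]
        # Smoothing step: average each bin with its neighbours (edges replicated).
        pad = [x[0]] + x + [x[-1]]
        x = [0.25 * pad[i] + 0.5 * pad[i + 1] + 0.25 * pad[i + 2]
             for i in range(d)]
        total = sum(x)
        x = [xi / total for xi in x]
    return x
```

A caller would have each user run `sw_report` locally and send only the result, while the aggregator runs `ems` on the collected reports. The ratio p / q = e^epsilon bounds how much any single report can reveal, whatever value of `b` is chosen; `b` only affects utility.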