ST CR LG ML
Aug 28, 2020

Deconvoluting Kernel Density Estimation and Regression for Locally Differentially Private Data

arXiv:2008.12466v2
10 citations
AI Analysis

This paper addresses a challenge facing social scientists who work with differentially private data, such as the 2020 U.S. Census. The contribution is incremental: it adapts existing deconvolution and errors-in-variables regression frameworks to the privacy setting.

The paper tackles the distortion that additive noise introduces into the probability density of locally differentially private data, which can lead to under- or over-estimation of heavy hitters. It develops deconvoluting kernel density estimators and regression models that remove the effect of the noise, and demonstrates their performance on financial and demographic datasets.

Local differential privacy has become the gold standard in the privacy literature for gathering or releasing sensitive individual data points in a privacy-preserving manner. However, locally differentially private data can distort the probability density of the data because of the additive noise used to ensure privacy. In fact, the density of privacy-preserving data (no matter how many samples we gather) is always flatter than the density function of the original data points, because it is the convolution of the original density with the density of the privacy-preserving noise. The effect is especially pronounced when using slow-decaying privacy-preserving noise, such as Laplace noise. This can result in under- or over-estimation of the heavy hitters. This is an important challenge facing social scientists due to the use of differential privacy in the 2020 Census in the United States. In this paper, we develop density estimation methods using smoothing kernels. We use the framework of deconvoluting kernel density estimators to remove the effect of privacy-preserving noise. This approach also allows us to adapt results from nonparametric regression with errors-in-variables to develop regression models based on locally differentially private data. We demonstrate the performance of the developed methods on financial and demographic datasets.
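To make the flattening effect and its correction concrete, here is a minimal sketch of a deconvoluting kernel density estimator for Laplace noise. It is an illustration under stated assumptions, not the paper's implementation: the Gaussian kernel, the function names `naive_kde` and `laplace_deconv_kde`, the synthetic standard-normal data, and the values of the noise scale `b` and bandwidth `h` are all chosen here for demonstration. It relies on the standard fact that the Laplace characteristic function is 1 / (1 + b²t²), so the deconvoluting kernel has the closed form K*(u) = K(u) - (b/h)² K''(u).

```python
import numpy as np


def naive_kde(y, grid, h):
    """Ordinary Gaussian KDE applied directly to the noisy reports y."""
    u = (grid[:, None] - y[None, :]) / h
    gauss = np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)
    return gauss.mean(axis=1) / h


def laplace_deconv_kde(y, grid, h, b):
    """Deconvoluting KDE for y = x + Laplace(0, b) noise, Gaussian kernel.

    Dividing by the Laplace characteristic function 1 / (1 + b^2 t^2) in the
    Fourier domain gives K*(u) = K(u) - (b/h)^2 K''(u). For the Gaussian
    kernel, K''(u) = (u^2 - 1) K(u).
    """
    u = (grid[:, None] - y[None, :]) / h
    gauss = np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)
    k_star = gauss * (1.0 + (b / h) ** 2 * (1.0 - u**2))
    return k_star.mean(axis=1) / h


# Illustrative setup; every value below is a hypothetical choice.
rng = np.random.default_rng(0)
n, b, h = 20_000, 0.5, 0.4
x = rng.normal(0.0, 1.0, size=n)       # original (private) data
y = x + rng.laplace(0.0, b, size=n)    # noisy reports that are released
grid = np.linspace(-6.0, 6.0, 121)     # evaluation points, spacing 0.1

f_naive = naive_kde(y, grid, h)
f_deconv = laplace_deconv_kde(y, grid, h, b)

centre = len(grid) // 2                # index of the grid point at 0
print("true density at 0:   ", 1.0 / np.sqrt(2.0 * np.pi))
print("naive estimate at 0: ", f_naive[centre])
print("deconv estimate at 0:", f_deconv[centre])
```

The naive estimate targets the flattened density of the noisy reports, while the deconvoluted estimate targets the density of the original data. The deconvoluted estimate can dip below zero in the tails; a common remedy is to truncate it at zero and renormalize. With the Laplace mechanism, `b` would typically be set to the sensitivity divided by the privacy budget, which is assumed known to the analyst.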

