ST, ME, ML · Oct 31, 2020

Strongly universally consistent nonparametric regression and classification with privatised data

arXiv:2011.00216v1 · 19 citations
AI Analysis

This work addresses privacy-preserving statistical learning, giving a rigorous method with theoretical guarantees for analysing sensitive data. The contribution is incremental in that it adapts an existing estimator, the partitioning regression estimator, to a privacy setting.

The paper tackles nonparametric regression and classification under local differential privacy constraints. It adds Laplace noise to discretised data and designs a privatised version of the partitioning regression estimator, which is shown to be strongly universally consistent for both tasks.

In this paper we revisit the classical problem of nonparametric regression, but impose local differential privacy constraints. Under such constraints, the raw data $(X_1,Y_1),\ldots,(X_n,Y_n)$, taking values in $\mathbb{R}^d \times \mathbb{R}$, cannot be directly observed, and all estimators are functions of the randomised output from a suitable privacy mechanism. The statistician is free to choose the form of the privacy mechanism, and here we add Laplace distributed noise to a discretisation of the location of a feature vector $X_i$ and to the value of its response variable $Y_i$. Based on this randomised data, we design a novel estimator of the regression function, which can be viewed as a privatised version of the well-studied partitioning regression estimator. The main result is that the estimator is strongly universally consistent. Our methods and analysis also give rise to a strongly universally consistent binary classification rule for locally differentially private data.
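The mechanism described in the abstract can be illustrated with a minimal sketch. This is not the paper's exact construction: it assumes one-dimensional features, a finite grid of cells given by `edges`, responses clipped to `[-clip, clip]`, and a privacy budget `eps` split evenly between the two releases. The function names `privatise` and `partition_estimate`, the clipping level, the noise scales, and the `threshold` rule are all illustrative assumptions.

```python
import numpy as np

def privatise(x, y, edges, clip, eps, rng):
    """Locally privatise (x_i, y_i): noisy cell indicators and noisy clipped responses.

    Assumptions (not from the paper): 1-D features, a finite grid `edges`,
    and the budget `eps` split evenly between the two noisy releases.
    """
    n_cells = len(edges) - 1
    # Cell index of each x_i; points outside the grid are clamped into the end cells.
    idx = np.clip(np.searchsorted(edges, x, side="right") - 1, 0, n_cells - 1)
    onehot = np.zeros((len(x), n_cells))
    onehot[np.arange(len(x)), idx] = 1.0
    y_clip = np.clip(y, -clip, clip)
    # Two one-hot vectors differ by at most 2 in L1 norm, so Laplace noise
    # with scale 2 / (eps / 2) gives (eps / 2)-local differential privacy.
    w = onehot + rng.laplace(scale=4.0 / eps, size=onehot.shape)
    # The clipped response placed in its cell has L1 sensitivity at most 2 * clip.
    z = y_clip[:, None] * onehot + rng.laplace(scale=4.0 * clip / eps, size=onehot.shape)
    return w, z

def partition_estimate(w, z, threshold):
    """Per-cell ratio of averaged noisy responses to averaged noisy indicators.

    Cells whose noisy relative frequency is at most `threshold` are set to 0,
    a hypothetical guard against dividing by a denominator dominated by noise.
    """
    num = z.mean(axis=0)
    den = w.mean(axis=0)
    ok = den > threshold
    return np.where(ok, num / np.where(ok, den, 1.0), 0.0)

# Illustrative usage on synthetic data (all values are assumptions).
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, size=5000)
y = np.sin(2 * np.pi * x) + 0.1 * rng.standard_normal(5000)
edges = np.linspace(0.0, 1.0, 6)  # five cells on [0, 1]
w, z = privatise(x, y, edges, clip=2.0, eps=2.0, rng=rng)
m_hat = partition_estimate(w, z, threshold=0.05)  # one value per cell
```

The estimator only ever sees `w` and `z`, never the raw pairs, which is the sense in which all estimators are functions of the randomised output. As `eps` grows the noise vanishes and the ratio reduces to the ordinary partitioning estimate, that is, the average of the clipped responses in each cell.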

