ML, CR, LG, EM, ME
Jul 26, 2022

Differentially Private Estimation via Statistical Depth

arXiv:2207.12602v14 citations
h-index: 2
Originality: Incremental advance
AI Analysis

This work addresses differentially private statistical estimation without relying on a priori bounds on the data or the estimates. The advance is incremental: it builds on existing differential privacy (DP) and statistical depth concepts.

The paper tackles the challenge of constructing differentially private estimators without requiring exogenous bounds on the data or the estimates. It leverages statistical depth measures, for which the maximum influence of a single observation is easy to analyze and typically low. It introduces new approximate DP location and regression estimators based on halfspace depth and regression depth. In simulations, the proposed regression estimators perform favorably relative to the existing DP regression methods considered when the sample size is at least 100-200 or the privacy-loss budget is sufficiently high.

Constructing a differentially private (DP) estimator requires deriving the maximum influence of an observation, which can be difficult in the absence of exogenous bounds on the input data or the estimator, especially in high dimensional settings. This paper shows that standard notions of statistical depth, i.e., halfspace depth and regression depth, are particularly advantageous in this regard, both in the sense that the maximum influence of a single observation is easy to analyze and that this value is typically low. This is used to motivate new approximate DP location and regression estimators using the maximizers of these two notions of statistical depth. A more computationally efficient variant of the approximate DP regression estimator is also provided. Also, to avoid requiring that users specify a priori bounds on the estimates and/or the observations, variants of these DP mechanisms are described that satisfy random differential privacy (RDP), which is a relaxation of differential privacy provided by Hall, Wasserman, and Rinaldo (2013). We also provide simulations of the two DP regression methods proposed here. The proposed estimators appear to perform favorably relative to the existing DP regression methods we consider in these simulations when either the sample size is at least 100-200 or the privacy-loss budget is sufficiently high.
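
To illustrate why depth has a low, easily bounded maximum influence, here is a minimal sketch in Python. It is not the paper's estimator. It assumes one-dimensional data, uses the count version of halfspace depth, and swaps in the standard exponential mechanism over a fixed, data-independent candidate grid. That grid is exactly the kind of a priori bound the paper aims to avoid. The function names `halfspace_depth_1d` and `dp_depth_location` are hypothetical. Replacing one observation changes each of the two counts by at most 1, so the depth of any candidate changes by at most 1.

```python
import math
import random

def halfspace_depth_1d(theta, xs):
    """Count version of halfspace (Tukey) depth of theta in a 1-D sample."""
    below = sum(1 for x in xs if x <= theta)  # points in the closed left halfline
    above = sum(1 for x in xs if x >= theta)  # points in the closed right halfline
    return min(below, above)

def dp_depth_location(xs, candidates, epsilon, rng=None):
    """Illustrative only: exponential mechanism with depth as the utility.

    Assumes `candidates` is chosen without looking at the data. Replacing one
    observation changes any depth by at most 1, so the utility has sensitivity 1
    and sampling with weights exp(epsilon * depth / 2) is epsilon-DP.
    """
    rng = rng or random.Random()
    depths = [halfspace_depth_1d(c, xs) for c in candidates]
    top = max(depths)  # subtract the maximum for numerical stability
    weights = [math.exp(epsilon * (d - top) / 2.0) for d in depths]
    return rng.choices(candidates, weights=weights, k=1)[0]

sample = [1.0, 2.0, 3.0, 4.0, 5.0]
grid = [i * 0.5 for i in range(13)]  # hypothetical candidate grid on [0, 6]
estimate = dp_depth_location(sample, grid, epsilon=1.0, rng=random.Random(0))
```

With a small `epsilon` the draw is spread across the grid. As `epsilon` grows, the draw concentrates on the deepest candidate, which here is the sample median.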
