ML · CR · LG · Sep 19, 2023

DPpack: An R Package for Differentially Private Statistical Analysis and Machine Learning

arXiv:2309.10965v1 · 1 citation · h-index: 57 · Has Code
Originality: Synthesis-oriented
AI Analysis

This package addresses the need for accessible privacy-preserving tools in data analysis, but its contribution is incremental: it implements existing mechanisms and models in a user-friendly format.

The authors address the challenge of applying differential privacy in statistical analysis and machine learning by developing DPpack, an open-source R package. It provides differentially private descriptive statistics and models (logistic regression, SVM, and linear regression) and supports the Laplace, Gaussian, and exponential mechanisms.

Differential privacy (DP) is the state-of-the-art framework for guaranteeing privacy for individuals when releasing aggregated statistics or building statistical/machine learning models from data. We develop the open-source R package DPpack, which provides a large toolkit for differentially private analysis. The current version of DPpack implements three popular mechanisms for ensuring DP: Laplace, Gaussian, and exponential. Beyond that, DPpack provides a large toolkit of easily accessible privacy-preserving descriptive statistics functions. These include mean, variance, covariance, and quantiles, as well as histograms and contingency tables. Finally, DPpack provides user-friendly implementations of privacy-preserving versions of logistic regression, SVM, and linear regression, as well as differentially private hyperparameter tuning for each of these models. This extensive collection of implemented differentially private statistics and models permits hassle-free use of differential privacy principles in commonly performed statistical analyses. We plan to continue developing DPpack and to make it more comprehensive by including more differentially private machine learning techniques, statistical modeling, and inference in the future.
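To make the Laplace mechanism named in the abstract concrete, here is a minimal Python sketch of a differentially private mean. It is not DPpack's R API or the authors' implementation. The function name `dp_mean`, its parameters, and the assumptions that the data are clipped to publicly known bounds and that the sample size is public are all illustrative choices.

```python
import random


def dp_mean(values, lower, upper, epsilon, rng=None):
    """Illustrative epsilon-DP mean via the Laplace mechanism.

    Assumptions (hypothetical, not taken from DPpack): `lower` and `upper`
    are publicly known bounds, the sample size is public, and neighboring
    datasets differ by replacing one record.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    if lower >= upper:
        raise ValueError("lower must be strictly less than upper")
    if not values:
        raise ValueError("values must be non-empty")
    if rng is None:
        rng = random.Random()

    # Clip each value so that one record can move the mean by a bounded amount.
    clipped = [min(max(v, lower), upper) for v in values]
    n = len(clipped)
    true_mean = sum(clipped) / n

    # Replacing one record changes the clipped mean by at most (upper - lower) / n.
    sensitivity = (upper - lower) / n
    scale = sensitivity / epsilon

    # The difference of two i.i.d. exponentials with mean `scale`
    # is distributed as Laplace(0, scale).
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return true_mean + noise


if __name__ == "__main__":
    ages = [23, 35, 41, 29, 52, 38]
    print(dp_mean(ages, lower=18, upper=90, epsilon=1.0))
```

Smaller values of `epsilon` give stronger privacy and noisier output. Floating-point noise sampling of this kind is known to be vulnerable to precision-based attacks, so the sketch is meant for intuition rather than for production use.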

