Differentially Private Distributed Bayesian Linear Regression with MCMC
This work addresses privacy-preserving data analysis distributed across multiple parties. Its contribution is incremental, since it builds on existing Bayesian and differential privacy methods.
The paper tackles Bayesian linear regression in a distributed setting under differential privacy. It develops a novel generative model for privately shared summary statistics and performs inference with MCMC algorithms. Experiments on real and simulated data show well-rounded estimation and prediction.
We propose a novel Bayesian inference framework for distributed differentially private linear regression. We consider a distributed setting where multiple parties each hold part of the data and share certain summary statistics of their portions, perturbed with privacy-preserving noise. We develop a novel generative statistical model for the privately shared statistics, which exploits a useful distributional relation between the summary statistics of linear regression. Bayesian estimation of the regression coefficients is conducted mainly using Markov chain Monte Carlo algorithms; we also provide a fast version that performs Bayesian estimation in one iteration. The proposed methods have computational advantages over their competitors. We provide numerical results on both real and simulated data, which demonstrate that the proposed algorithms provide well-rounded estimation and prediction.
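To make the data-sharing setting concrete, the following is a minimal sketch, not the paper's algorithm. It assumes the shared summary statistics are X^T X and X^T y and that each party perturbs them with Gaussian noise of a given standard deviation. The helper names `noisy_summaries` and `plugin_estimate`, the `noise_std` parameter, and the ridge term are hypothetical. In a real deployment the noise scale would have to be calibrated to the sensitivity of the statistics and the privacy budget, which this sketch does not do. The final step is a naive plug-in estimate that ignores the noise, shown only as a baseline that a generative model with MCMC would aim to improve on.

```python
import numpy as np


def noisy_summaries(X, y, noise_std, rng):
    """One party's shared statistics (hypothetical helper).

    Assumption: the party releases S = X^T X and z = X^T y plus Gaussian
    noise. noise_std is taken as given; it is NOT calibrated here to any
    (epsilon, delta) privacy guarantee.
    """
    d = X.shape[1]
    S = X.T @ X
    z = X.T @ y
    # Symmetric noise matrix: draw a full matrix, keep the upper triangle
    # (including the diagonal) and mirror it below the diagonal.
    E = rng.normal(0.0, noise_std, size=(d, d))
    E = np.triu(E) + np.triu(E, 1).T
    return S + E, z + rng.normal(0.0, noise_std, size=d)


def plugin_estimate(shares, ridge=1e-6):
    """Naive baseline: sum the noisy statistics and solve the normal equations.

    This ignores the privacy noise entirely; it is not the Bayesian
    estimator described in the abstract.
    """
    S_total = sum(S for S, _ in shares)
    z_total = sum(z for _, z in shares)
    d = S_total.shape[0]
    # A small ridge term (an assumption) guards against a noisy,
    # ill-conditioned S_total.
    return np.linalg.solve(S_total + ridge * np.eye(d), z_total)
```

For example, each party would call `noisy_summaries` on its own rows with its own random generator, and an aggregator would pass the list of returned pairs to `plugin_estimate`. With `noise_std=0` and `ridge=0` this reduces to ordinary least squares on the pooled data.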