CR | LG | Jun 16, 2022

Differentially Private Multi-Party Data Release for Linear Regression

arXiv:2206.07998v2 | 2 citations | h-index: 80
Originality: Incremental advance
AI Analysis

The paper addresses privacy concerns in multi-party data sharing, though the advance is incremental because it builds on existing DP techniques.

The paper tackles the problem of releasing data for linear regression across multiple parties that hold disjoint attributes, while preserving differential privacy. It shows that the proposed method asymptotically converges to the optimal non-private solution as the dataset size grows.

Differentially Private (DP) data release is a promising technique to disseminate data without compromising the privacy of data subjects. However, the majority of prior work has focused on scenarios where a single party owns all the data. In this paper we focus on the multi-party setting, where different stakeholders own disjoint sets of attributes belonging to the same group of data subjects. In the context of linear regression, we aim to allow all parties to train models on the complete data without the ability to infer private attributes or identities of individuals. We start by directly applying the Gaussian mechanism and show that it suffers from the small eigenvalue problem. We then propose a novel method and prove that it asymptotically converges to the optimal (non-private) solutions as the dataset size increases. We substantiate the theoretical results through experiments on both artificial and real-world datasets.
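
The small eigenvalue problem mentioned in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm. It assumes the problem refers to Gaussian noise added to the Gram matrix X^T X pushing its smallest eigenvalue close to zero (or below), which makes the regression solution unstable when the dataset is small. The function names, the noise scale `sigma` (not calibrated to any privacy budget), and the synthetic data are all hypothetical choices made for illustration.

```python
import numpy as np

def noisy_stats(X, y, sigma, rng):
    """Perturb the regression statistics X^T X and X^T y with Gaussian noise.

    sigma is an illustrative noise scale, NOT derived from an (epsilon, delta) budget.
    """
    d = X.shape[1]
    noise = rng.normal(0.0, sigma, size=(d, d))
    # Mirror the upper triangle so the perturbed Gram matrix stays symmetric.
    sym_noise = np.triu(noise) + np.triu(noise, 1).T
    gram = X.T @ X + sym_noise
    xty = X.T @ y + rng.normal(0.0, sigma, size=d)
    return gram, xty

def run_trial(n, d=5, sigma=20.0, seed=0):
    """Return (smallest eigenvalue of the noisy Gram matrix, relative error of the fit)."""
    rng = np.random.default_rng(seed)
    theta_true = np.ones(d)
    # Scale features so every row has Euclidean norm at most 1.
    X = rng.uniform(-1.0, 1.0, size=(n, d)) / np.sqrt(d)
    y = X @ theta_true + rng.normal(0.0, 0.1, size=n)
    gram, xty = noisy_stats(X, y, sigma, rng)
    min_eig = float(np.linalg.eigvalsh(gram).min())
    theta_hat = np.linalg.solve(gram, xty)
    rel_err = float(np.linalg.norm(theta_hat - theta_true) / np.linalg.norm(theta_true))
    return min_eig, rel_err

# The noise scale is fixed while X^T X grows roughly linearly with n.
for n in (50, 500, 50_000):
    min_eig, rel_err = run_trial(n)
    print(f"n={n:>6}  min eigenvalue={min_eig:10.2f}  relative error={rel_err:.3f}")
```

Because the noise scale stays fixed while the eigenvalues of X^T X grow with the number of records, the perturbation matters less for large datasets. This is consistent with the asymptotic convergence the abstract claims for the proposed method, although the sketch says nothing about how that method works.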

