LG · CR · ME · ML · Jul 10, 2020

Differentially Private Simple Linear Regression

arXiv:2007.05157v1 · 64 citations
Originality: Incremental advance
AI Analysis

This work addresses privacy concerns in economics and social science research that relies on fine-grained analysis of sensitive data. It is an incremental advance, since it adapts existing algorithms to a specific setting.

The paper tackles the challenge of performing differentially private simple linear regression on small datasets (tens to hundreds of datapoints), showing that robust estimators like Theil-Sen perform best for the smallest datasets while standard algorithms improve with larger sizes.

Economics and social science research often requires analyzing datasets of sensitive personal information at fine granularity, with models fit to small subsets of the data. Unfortunately, such fine-grained analysis can easily reveal sensitive individual information. We study algorithms for simple linear regression that satisfy differential privacy, a constraint that guarantees that an algorithm's output reveals little about any individual input data record, even to an attacker with arbitrary side information about the dataset. We consider the design of differentially private algorithms for simple linear regression for small datasets, with tens to hundreds of datapoints, which is a particularly challenging regime for differential privacy. Focusing on a particular application to small-area analysis in economics research, we study the performance of a spectrum of algorithms we adapt to the setting. We identify key factors that affect their performance, showing through a range of experiments that algorithms based on robust estimators (in particular, the Theil-Sen estimator) perform well on the smallest datasets, but that other more standard algorithms do better as the dataset size increases.
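
To make the Theil-Sen idea concrete, here is a minimal Python sketch of one way a differentially private slope estimate could be built: pair the points with a random matching (so each record influences at most one pairwise slope), clip the slopes to a public range, and release their median with the exponential mechanism. This is an illustration under stated assumptions, not the paper's exact algorithm. The function names, the clipping bounds lo and hi, and the choice of a single matching are all hypothetical; only the slope is privatized, and floating-point side channels are ignored.

```python
import math
import random


def matched_slopes(xs, ys, lo, hi, rng):
    """Pair points via a random matching; return one clipped slope per pair."""
    idx = list(range(len(xs)))
    rng.shuffle(idx)  # the pairing depends only on indices, never on the data
    slopes = []
    for i, j in zip(idx[0::2], idx[1::2]):
        dx, dy = xs[j] - xs[i], ys[j] - ys[i]
        if dx == 0:
            # Vertical pair: map to an endpoint (or 0) so the slope count stays fixed.
            s = 0.0 if dy == 0 else (hi if dy > 0 else lo)
        else:
            s = dy / dx
        slopes.append(min(max(s, lo), hi))  # clip to the assumed public range
    return slopes


def dp_median(values, lo, hi, epsilon, rng):
    """Exponential mechanism for the median of values clipped to [lo, hi]."""
    if not lo < hi:
        raise ValueError("need lo < hi")
    pts = [lo] + sorted(values) + [hi]
    n = len(values)
    log_w = []
    for k in range(n + 1):
        length = pts[k + 1] - pts[k]
        # Every point in interval k has exactly k values below it, so the
        # utility -|k - n/2| changes by at most 1 when one value changes.
        utility = -abs(k - n / 2)
        if length > 0:
            log_w.append(math.log(length) + epsilon * utility / 2)
        else:
            log_w.append(-math.inf)
    m = max(log_w)  # subtract the max for numerical stability
    weights = [math.exp(w - m) for w in log_w]
    k = rng.choices(range(n + 1), weights=weights)[0]
    return rng.uniform(pts[k], pts[k + 1])


def dp_theil_sen_slope(xs, ys, lo, hi, epsilon, seed=None):
    """Sketch of an epsilon-DP slope: DP median of matched pairwise slopes."""
    rng = random.Random(seed)
    slopes = matched_slopes(xs, ys, lo, hi, rng)
    return dp_median(slopes, lo, hi, epsilon, rng)


if __name__ == "__main__":
    # Synthetic small dataset: y = 2x + 1 plus Gaussian noise.
    data_rng = random.Random(0)
    xs = [data_rng.random() for _ in range(100)]
    ys = [2.0 * x + 1.0 + data_rng.gauss(0.0, 0.1) for x in xs]
    print(dp_theil_sen_slope(xs, ys, lo=-10.0, hi=10.0, epsilon=1.0, seed=1))
```

The matching is what keeps the sensitivity low: using all pairwise slopes, as the non-private Theil-Sen estimator does, would let one record affect n - 1 slopes. The price is that only about n/2 slopes are available, which matters at the dataset sizes discussed above.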

