LGCRSTMLJan 30, 2023

Near Optimal Private and Robust Linear Regression

arXiv:2301.13273v112 citationsh-index: 58
Originality Highly original
AI Analysis

This addresses privacy and security concerns in statistical estimation for data analysts, though it is incremental as it builds on DP-SGD.

The paper tackles linear regression under differential privacy with adversarial label corruption, achieving near-optimal sample complexity without corruption and being the first efficient algorithm to guarantee both privacy and robustness.

We study the canonical statistical estimation problem of linear regression from $n$ i.i.d.~examples under $(\varepsilon,δ)$-differential privacy when some response variables are adversarially corrupted. We propose a variant of the popular differentially private stochastic gradient descent (DP-SGD) algorithm with two innovations: a full-batch gradient descent to improve sample complexity and a novel adaptive clipping to guarantee robustness. When there is no adversarial corruption, this algorithm improves upon the existing state-of-the-art approach and achieves a near optimal sample complexity. Under label-corruption, this is the first efficient linear regression algorithm to guarantee both $(\varepsilon,δ)$-DP and robustness. Synthetic experiments confirm the superiority of our approach.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes