LG · CR · Jun 1, 2023

Better Private Linear Regression Through Better Private Feature Selection

arXiv:2306.00920v1 · 6 citations · h-index: 18
Originality: Incremental advance
AI Analysis

This work addresses practical usability issues in private linear regression for data analysts, though it appears incremental as it extends existing algorithms rather than introducing a fundamentally new paradigm.

The paper tackles differentially private linear regression in high-dimensional settings, where users struggle to set data bounds without examining the data and thereby violating privacy. It introduces a private feature selection method based on Kendall rank correlation that runs before the regression step. Experiments across 25 datasets show that this approach significantly broadens applicability at little additional privacy or computational cost.

Existing work on differentially private linear regression typically assumes that end users can precisely set data bounds or algorithmic hyperparameters. End users often struggle to meet these requirements without directly examining the data (and violating privacy). Recent work has attempted to develop solutions that shift these burdens from users to algorithms, but they struggle to provide utility as the feature dimension grows. This work extends these algorithms to higher-dimensional problems by introducing a differentially private feature selection method based on Kendall rank correlation. We prove a utility guarantee for the setting where features are normally distributed and conduct experiments across 25 datasets. We find that adding this private feature selection step before regression significantly broadens the applicability of "plug-and-play" private linear regression algorithms at little additional cost to privacy, computation, or decision-making by the end user.
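
To make the idea of private feature selection by Kendall rank correlation concrete, here is a minimal Python sketch. It is not the paper's algorithm: it assumes a simplified scheme in which the absolute Kendall tau-a between each feature and the label is perturbed with Laplace noise and the k largest noisy scores are kept. The function names, the choice of tau-a, the replace-one-record neighboring relation, and the use of a single Laplace release over all d scores are assumptions made for illustration only.

```python
import numpy as np


def kendall_tau_a(x, y):
    """Kendall's tau-a: (concordant - discordant pairs) / (number of pairs)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    # Sign products over all ordered pairs (i, j); tied pairs contribute 0.
    sx = np.sign(x[:, None] - x[None, :])
    sy = np.sign(y[:, None] - y[None, :])
    # Each unordered pair is counted twice, so divide by n * (n - 1).
    return float((sx * sy).sum() / (n * (n - 1)))


def private_top_k_features(X, y, k, epsilon, rng):
    """Hypothetical helper: indices of the k features with the largest
    Laplace-noised |tau-a| against the label y."""
    X = np.asarray(X, dtype=float)
    n, d = X.shape
    scores = np.array([abs(kendall_tau_a(X[:, j], y)) for j in range(d)])
    # Replacing one record touches n - 1 pairs, each changing by at most 2,
    # so every |tau-a| moves by at most 4 / n and the length-d score vector
    # has L1 sensitivity at most 4 * d / n.
    scale = 4.0 * d / (n * epsilon)
    noisy = scores + rng.laplace(loc=0.0, scale=scale, size=d)
    # Picking the top k from the noisy scores is post-processing.
    top = np.argsort(noisy)[::-1][:k]
    return sorted(int(j) for j in top)
```

Calling `private_top_k_features(X, y, k=2, epsilon=1.0, rng=np.random.default_rng(0))` returns two column indices that a downstream private regression could be restricted to. Because rank correlation depends only on the ordering of values, this selection step needs no user-supplied data bounds, which is the property the abstract emphasizes. Noising all d scores at once is the simplest way to account for privacy but spends budget loosely; treat the sketch as an illustration rather than a production mechanism.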

