LG | CR | ML | Apr 1, 2024

SoK: A Review of Differentially Private Linear Models For High-Dimensional Data

arXiv:2404.01141v1 | 6 citations | h-index: 14 | 2024 IEEE Conference on Secure and Trustworthy Machine Learning (SaTML)
Originality: Synthesis-oriented
AI Analysis

This is an incremental review that synthesizes existing methods to guide future research in privacy-preserving machine learning for high-dimensional data.

The paper addresses the lack of a systematic comparison among optimization methods for differentially private linear models on high-dimensional data. Its empirical tests find that robust and coordinate-optimized algorithms perform best.

Linear models are ubiquitous in data science, but are particularly prone to overfitting and data memorization in high dimensions. To guarantee the privacy of training data, differential privacy can be used. Many papers have proposed optimization techniques for high-dimensional differentially private linear models, but a systematic comparison between these methods does not exist. We close this gap by providing a comprehensive review of optimization methods for private high-dimensional linear models. Empirical tests on all methods demonstrate robust and coordinate-optimized algorithms perform best, which can inform future research. Code for implementing all methods is released online.
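
To make the idea of a differentially private linear model concrete, here is a minimal sketch of one generic approach: gradient descent on a least-squares loss, where each example's gradient is clipped and Gaussian noise is added before the update. This is an illustration only and is not taken from the paper; it is not claimed to be one of the methods the review compares. The function names (`clip_rows`, `dp_gd_linear_regression`) and parameters (`clip_norm`, `noise_multiplier`) are hypothetical, and the sketch does no privacy accounting, so it does not by itself certify any particular privacy budget.

```python
import numpy as np


def clip_rows(G, clip_norm):
    """Scale each row of G so that its L2 norm is at most clip_norm."""
    norms = np.linalg.norm(G, axis=1, keepdims=True)
    # Rows already within the bound get factor 1; longer rows are shrunk.
    factors = np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    return G * factors


def dp_gd_linear_regression(X, y, steps=50, lr=0.1, clip_norm=1.0,
                            noise_multiplier=1.0, seed=0):
    """Noisy gradient descent for least squares (illustrative sketch).

    Assumption: noise std is noise_multiplier * clip_norm per step.
    Converting this to a formal privacy guarantee requires a separate
    accounting step that is omitted here.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(steps):
        residual = X @ w - y                      # shape (n,)
        # Per-example gradient of 0.5 * (x @ w - y)**2 with respect to w.
        per_example_grads = residual[:, None] * X
        clipped = clip_rows(per_example_grads, clip_norm)
        noise = rng.normal(0.0, noise_multiplier * clip_norm, size=d)
        w = w - lr * (clipped.sum(axis=0) + noise) / n
    return w
```

Clipping bounds how much any single training example can move the weights, and the added noise masks what remains of that influence. In high dimensions the noise vector grows with the number of features, which is one reason specialized optimization methods are studied for this setting.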

