SoK: A Review of Differentially Private Linear Models For High-Dimensional Data
This is an incremental review that synthesizes existing methods to guide future research in privacy-preserving machine learning for high-dimensional data.
The paper tackles the lack of a systematic comparison among optimization methods for differentially private linear models on high-dimensional data, finding through empirical tests that robust and coordinate-optimized algorithms perform best.
Linear models are ubiquitous in data science but are particularly prone to overfitting and data memorization in high dimensions. Differential privacy can be used to guarantee the privacy of the training data. Many papers have proposed optimization techniques for high-dimensional differentially private linear models, but no systematic comparison of these methods exists. We close this gap by providing a comprehensive review of optimization methods for private high-dimensional linear models. Empirical tests on all methods demonstrate that robust and coordinate-optimized algorithms perform best, a finding that can inform future research. Code for implementing all methods is released online.
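To make the setting concrete, the following is a minimal sketch of one generic way to train a linear model with noise added for privacy: gradient descent with per-example gradient clipping and Gaussian noise. It is not taken from the reviewed methods or the released code. The function name `dp_gradient_descent` and the parameters `clip_norm` and `noise_multiplier` are illustrative assumptions, and the sketch does not compute the resulting privacy budget, which would require a separate privacy accounting step.

```python
import numpy as np


def dp_gradient_descent(X, y, steps=100, lr=0.1, clip_norm=1.0,
                        noise_multiplier=1.0, seed=0):
    """Noisy gradient descent for least-squares regression (illustrative).

    Each example's gradient is clipped to L2 norm `clip_norm`, the clipped
    gradients are summed, and Gaussian noise with standard deviation
    `noise_multiplier * clip_norm` is added to the sum before averaging.
    The privacy guarantee implied by these settings is NOT computed here.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(steps):
        residuals = X @ w - y                        # shape (n,)
        per_example_grads = residuals[:, None] * X   # shape (n, d)
        norms = np.linalg.norm(per_example_grads, axis=1)
        # Scale each gradient so its norm is at most clip_norm.
        scale = np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
        clipped = per_example_grads * scale[:, None]
        noise = rng.normal(0.0, noise_multiplier * clip_norm, size=d)
        w -= lr * (clipped.sum(axis=0) + noise) / n
    return w


# Hypothetical usage on synthetic data.
data_rng = np.random.default_rng(42)
X_demo = data_rng.normal(size=(500, 20))
w_demo = data_rng.normal(size=20)
y_demo = X_demo @ w_demo + 0.1 * data_rng.normal(size=500)
w_private = dp_gradient_descent(X_demo, y_demo, steps=200)
```

In this sketch the injected noise has the same standard deviation in each of the `d` coordinates, so its overall norm grows with the dimension. That is one informal reason high-dimensional private optimization is harder than the low-dimensional case.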