Enhancing Differentially Private Linear Regression via Public Second-Moment
This work addresses utility limitations in differentially private linear regression for data analysts, representing an incremental improvement over existing methods.
The paper tackles the problem of utility degradation in differentially private linear regression by proposing a method that uses public second-moment data to transform private data, improving estimator accuracy and robustness. Experiments on synthetic and real-world datasets show the method's effectiveness, with concrete error bounds derived to quantify the gains.
Leveraging information from public data has become increasingly important for improving the utility of differentially private (DP) methods. Traditional DP approaches often add noise based solely on private data, which can significantly degrade utility. In this paper, we address this limitation for the ordinary least squares estimator (OLSE) of linear regression based on sufficient statistics perturbation (SSP) under the unbounded data assumption. We propose a novel method that transforms the private data using the public second-moment matrix and then computes a transformed SSP-OLSE; the second-moment matrix of the transformed data has a better condition number, which improves the accuracy and robustness of the OLSE. We derive theoretical error bounds for both our method and the standard SSP-OLSE relative to the non-DP OLSE, which reveal the improved robustness and accuracy achieved by our approach. Experiments on synthetic and real-world datasets demonstrate the utility and effectiveness of our method.
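To make the idea of transforming private data with a public second-moment matrix concrete, the following is a minimal sketch, not the paper's algorithm. It assumes the transformation is a whitening step that multiplies the private features by the inverse square root of the public second-moment matrix; the abstract does not spell out the exact form. All function names (`inv_sqrt_psd`, `ssp_olse`, `transformed_ssp_olse`) are hypothetical, and the noise scale `sigma` is a placeholder that is not calibrated to any privacy budget, so the sketch by itself carries no DP guarantee.

```python
import numpy as np


def inv_sqrt_psd(m, floor=1e-12):
    """Inverse square root of a symmetric positive-definite matrix."""
    w, v = np.linalg.eigh(m)
    w = np.maximum(w, floor)  # guard against tiny or negative eigenvalues
    return (v / np.sqrt(w)) @ v.T  # V diag(w^{-1/2}) V^T


def ssp_olse(x, y, sigma, rng):
    """OLSE from perturbed sufficient statistics X^T X and X^T y.

    `sigma` is an uncalibrated placeholder noise scale (assumption).
    """
    d = x.shape[1]
    e = rng.normal(scale=sigma, size=(d, d))
    e = np.triu(e) + np.triu(e, 1).T  # symmetrize the noise matrix
    noisy_xtx = x.T @ x + e
    noisy_xty = x.T @ y + rng.normal(scale=sigma, size=d)
    return np.linalg.solve(noisy_xtx, noisy_xty)


def transformed_ssp_olse(x, y, public_second_moment, sigma, rng):
    """SSP-OLSE on whitened features, mapped back to the original scale."""
    t = inv_sqrt_psd(public_second_moment)
    x_t = x @ t  # transformed private features
    beta_t = ssp_olse(x_t, y, sigma, rng)
    # y = X beta = (X T)(T^{-1} beta), so beta = T beta_t.
    return t @ beta_t


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    scales = np.array([10.0, 1.0, 0.1])  # deliberately ill-conditioned features
    x_pub = rng.normal(size=(2000, 3)) * scales
    x_priv = rng.normal(size=(2000, 3)) * scales
    beta = np.array([1.0, -2.0, 0.5])
    y_priv = x_priv @ beta + rng.normal(scale=0.1, size=2000)

    m_pub = x_pub.T @ x_pub / len(x_pub)
    t = inv_sqrt_psd(m_pub)
    print("cond (raw):        ", np.linalg.cond(x_priv.T @ x_priv))
    print("cond (transformed):", np.linalg.cond((x_priv @ t).T @ (x_priv @ t)))
    print("plain SSP-OLSE:    ", ssp_olse(x_priv, y_priv, 5.0, rng))
    print("transformed:       ", transformed_ssp_olse(x_priv, y_priv, m_pub, 5.0, rng))
```

When the public and private features share roughly the same second-moment structure, the whitened private second-moment matrix is close to a multiple of the identity, so the same amount of perturbation distorts the linear solve less. With `sigma = 0` both estimators reduce to the ordinary non-DP OLSE.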