Differentially Private $\ell_1$-norm Linear Regression with Heavy-tailed Data
This work addresses privacy-preserving machine learning on heavy-tailed data distributions, offering incremental improvements to differentially private regression when the loss function is not assumed to be Lipschitz.
The paper tackles differentially private linear regression with heavy-tailed data by proposing algorithms that achieve error bounds of $\tilde{O}(\sqrt{\frac{d}{n\epsilon}})$ under bounded second moments and $\tilde{O}((\frac{d}{n\epsilon})^{\frac{\theta-1}{\theta}})$ under bounded $\theta$-th moments, relaxing the typical Lipschitz assumption.
We study the problem of Differentially Private Stochastic Convex Optimization (DP-SCO) with heavy-tailed data. Specifically, we focus on $\ell_1$-norm linear regression in the $\epsilon$-DP model. While most previous work focuses on the case where the loss function is Lipschitz, here we only need to assume that the variates have bounded moments. First, we study the case where the $\ell_2$ norm of the data has a bounded second-order moment. We propose an algorithm based on the exponential mechanism and show that it is possible to achieve an upper bound of $\tilde{O}(\sqrt{\frac{d}{n\epsilon}})$ (with high probability). Next, we relax the assumption to a bounded $\theta$-th order moment for some $\theta\in (1, 2)$ and show that it is possible to achieve an upper bound of $\tilde{O}((\frac{d}{n\epsilon})^{\frac{\theta-1}{\theta}})$. Our algorithms can also be extended to more relaxed cases where only each coordinate of the data has bounded moments, and we obtain upper bounds of $\tilde{O}(\frac{d}{\sqrt{n\epsilon}})$ and $\tilde{O}(\frac{d}{(n\epsilon)^{\frac{\theta-1}{\theta}}})$ in the second and $\theta$-th moment cases, respectively.
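To make the role of the exponential mechanism concrete, the following is a minimal sketch of how it can select a regression vector under $\epsilon$-DP. It is not the algorithm of the paper, whose details are not given in the abstract. Everything specific here is an assumption made for illustration: the finite candidate grid, the clipping threshold `tau`, the function name `exp_mech_l1_regression`, and the Student-t test data. Clipping each absolute residual to $[0, \tau]$ is one simple way to bound the sensitivity of the score without a Lipschitz assumption: replacing one record changes the score by at most `tau`.

```python
import numpy as np

def exp_mech_l1_regression(X, y, candidates, epsilon, tau, rng=None):
    """Pick one candidate w via the exponential mechanism (epsilon-DP).

    Score of a candidate: minus the sum of per-sample absolute residuals,
    each clipped to [0, tau] (hypothetical choice, not from the paper).
    Replacing one record changes the score by at most tau, so sampling
    with probability proportional to exp(epsilon * score / (2 * tau))
    satisfies epsilon-DP.
    """
    rng = np.random.default_rng() if rng is None else rng
    residuals = np.abs(y[None, :] - candidates @ X.T)   # shape (m, n)
    scores = -np.minimum(residuals, tau).sum(axis=1)    # shape (m,)
    logits = epsilon * scores / (2.0 * tau)
    logits -= logits.max()                              # avoid overflow in exp
    probs = np.exp(logits)
    probs /= probs.sum()
    idx = rng.choice(len(candidates), p=probs)
    return candidates[idx], probs, scores

# Illustrative heavy-tailed data: Student-t with 3 degrees of freedom
# has a finite second moment but heavier tails than a Gaussian.
rng = np.random.default_rng(0)
n, d = 500, 2
X = rng.standard_t(df=3, size=(n, d))
w_true = np.array([1.0, -1.0])
y = X @ w_true + rng.standard_t(df=3, size=n)

# Hypothetical candidate set: a 9 x 9 grid over [-2, 2]^2.
grid = np.linspace(-2.0, 2.0, 9)
candidates = np.array([[a, b] for a in grid for b in grid])

w_priv, probs, scores = exp_mech_l1_regression(
    X, y, candidates, epsilon=1.0, tau=5.0, rng=rng
)
```

In this sketch the threshold `tau` trades bias for noise: a smaller value lowers the sensitivity, so the selection concentrates more sharply on high-scoring candidates, but it also distorts the $\ell_1$ loss on large residuals. It must be chosen without looking at the private data for the privacy guarantee to hold.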