Nearly Optimal Differentially Private ReLU Regression
This work addresses the problem of private nonconvex learning for researchers in machine learning and data privacy, offering a nearly optimal algorithm with broader applicability than prior methods.
The paper tackles ReLU regression under differential privacy by relaxing previous stringent assumptions to a more standard i.i.d. sub-Gaussian setting, achieving an excess population risk bound of ̃O(d^2/(N^2ε^2)) that is shown to be nearly optimal up to logarithmic factors.
In this paper, we investigate one of the most fundamental nonconvex learning problems, ReLU regression, in the Differential Privacy (DP) model. Previous studies on private ReLU regression heavily rely on stringent assumptions, such as constant bounded norms for feature vectors and labels. We relax these assumptions to a more standard setting, where data can be i.i.d. sampled from $O(1)$-sub-Gaussian distributions. We first show that when $\varepsilon = \tilde{O}(\sqrt{\frac{1}{N}})$ and there is some public data, it is possible to achieve an upper bound of $\tilde{O}(\frac{d^2}{N^2 \varepsilon^2})$ for the excess population risk in $(ε, δ)$-DP, where $d$ is the dimension and $N$ is the number of data samples. Moreover, we relax the requirement of $ε$ and public data by proposing and analyzing a one-pass mini-batch Generalized Linear Model Perceptron algorithm (DP-MBGLMtron). Additionally, using the tracing attack argument technique, we demonstrate that the minimax rate of the estimation error for $(\varepsilon, δ)$-DP algorithms is lower bounded by $Ω(\frac{d^2}{N^2 \varepsilon^2})$. This shows that DP-MBGLMtron achieves the optimal utility bound up to logarithmic factors. Experiments further support our theoretical results.