The Power of Sampling: Dimension-free Risk Bounds in Private ERM
This work addresses fundamental bottlenecks in private optimization for machine learning, offering a novel approach to improve efficiency and applicability in high-dimensional settings, though it is incremental in building on existing sampling techniques.
The paper tackled the challenges of differentially private empirical risk minimization (DP-ERM) for large-scale models, such as dimension dependence and non-smooth objectives, by proposing an algorithm using the regularized exponential mechanism with samplers, achieving rank-dependent risk bounds for non-smooth convex objectives with zeroth-order oracles, which prior methods did not accomplish.
Differentially private empirical risk minimization (DP-ERM) is a fundamental problem in private optimization. While the theory of DP-ERM is well-studied, as large-scale models become prevalent, traditional DP-ERM methods face new challenges, including (1) the prohibitive dependence on the ambient dimension, (2) the highly non-smooth objective functions, (3) costly first-order gradient oracles. Such challenges demand rethinking existing DP-ERM methodologies. In this work, we show that the regularized exponential mechanism combined with existing samplers can address these challenges altogether: under the standard unconstrained domain and low-rank gradients assumptions, our algorithm can achieve rank-dependent risk bounds for non-smooth convex objectives using only zeroth order oracles, which was not accomplished by prior methods. This highlights the power of sampling in differential privacy. We further construct lower bounds, demonstrating that when gradients are full-rank, there is no separation between the constrained and unconstrained settings. Our lower bound is derived from a general black-box reduction from unconstrained to the constrained domain and an improved lower bound in the constrained setting, which might be of independent interest.