LGFeb 18

Differentially Private Non-convex Distributionally Robust Optimization

Difei Xu, Meng Ding, Zebin Ma, Huanyi Xie, Youming Tao, Aicha Slaitane, Di Wang

arXiv:2602.16155v12.71 citationsh-index: 4

Originality Incremental advance

AI Analysis

This work addresses robustness and privacy in machine learning for applications with sensitive data and distribution shifts, representing an incremental advance by extending DP to DRO.

The paper tackles the problem of making Distributionally Robust Optimization (DRO) differentially private for non-convex losses, achieving utility bounds such as O(1/√n + (√d log(1/δ)/(nε))^{2/3}) for general ψ-divergence and matching the best-known result for DP-ERM with KL-divergence.

Real-world deployments routinely face distribution shifts, group imbalances, and adversarial perturbations, under which the traditional Empirical Risk Minimization (ERM) framework can degrade severely. Distributionally Robust Optimization (DRO) addresses this issue by optimizing the worst-case expected loss over an uncertainty set of distributions, offering a principled approach to robustness. Meanwhile, as training data in DRO always involves sensitive information, safeguarding it against leakage under Differential Privacy (DP) is essential. In contrast to classical DP-ERM, DP-DRO has received much less attention due to its minimax optimization structure with uncertainty constraint. To bridge the gap, we provide a comprehensive study of DP-(finite-sum)-DRO with $ψ$-divergence and non-convex loss. First, we study DRO with general $ψ$-divergence by reformulating it as a minimization problem, and develop a novel $(\varepsilon, δ)$-DP optimization method, called DP Double-Spider, tailored to this structure. Under mild assumptions, we show that it achieves a utility bound of $\mathcal{O}(\frac{1}{\sqrt{n}}+ (\frac{\sqrt{d \log (1/δ)}}{n \varepsilon})^{2/3})$ in terms of the gradient norm, where $n$ denotes the data size and $d$ denotes the model dimension. We further improve the utility rate for specific divergences. In particular, for DP-DRO with KL-divergence, by transforming the problem into a compositional finite-sum optimization problem, we develop a DP Recursive-Spider method and show that it achieves a utility bound of $\mathcal{O}((\frac{\sqrt{d \log(1/δ)}}{n\varepsilon})^{2/3} )$, matching the best-known result for non-convex DP-ERM. Experimentally, we demonstrate that our proposed methods outperform existing approaches for DP minimax optimization.

View on arXiv PDF

Similar