LG · Mar 24, 2025

Improved Rates of Differentially Private Nonconvex-Strongly-Concave Minimax Optimization

Ruijia Zhang, Mingxi Lei, Meng Ding, Zihang Xiang, Jinhui Xu, Di Wang

arXiv:2503.18317v1 · 7 citations · h-index: 14 · AAAI

Originality: Incremental advance

AI Analysis

This work addresses privacy-preserving optimization for machine learning practitioners dealing with complex models like deep AUC maximization, though it is incremental as it builds on existing DP methods.

The paper tackles differentially private minimax optimization for nonconvex-strongly-concave problems, common in deep learning, by first analyzing a DP-SGDA method and then proposing a new method that reduces gradient noise variance, improving the gradient norm bound from $\tilde{O}(\frac{d^{1/4}}{(nε)^{1/2}})$ to $\tilde{O}(\frac{d^{1/3}}{(nε)^{2/3}})$, matching the best-known result for DP non-convex empirical risk minimization.
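
To see how the two rates relate, one can take their ratio. This is only a back-of-the-envelope comparison that ignores the logarithmic factors and constants hidden by $\tilde{O}$:

$$
\frac{d^{1/3}/(nε)^{2/3}}{d^{1/4}/(nε)^{1/2}} = \frac{d^{1/12}}{(nε)^{1/6}} = \left(\frac{d}{(nε)^{2}}\right)^{1/12}
$$

The second rate is therefore the smaller one whenever $d < (nε)^2$. Since the first rate can be written as $(d/(nε)^2)^{1/4}$, this is also the regime in which the first rate is below 1.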

In this paper, we study the problem of (finite sum) minimax optimization in the Differential Privacy (DP) model. Unlike most of the previous studies on the (strongly) convex-concave settings or loss functions satisfying the Polyak-Lojasiewicz condition, here we mainly focus on the nonconvex-strongly-concave one, which encapsulates many models in deep learning such as deep AUC maximization. Specifically, we first analyze a DP version of Stochastic Gradient Descent Ascent (SGDA) and show that it is possible to get a DP estimator whose $l_2$-norm of the gradient for the empirical risk function is upper bounded by $\tilde{O}(\frac{d^{1/4}}{({nε})^{1/2}})$, where $d$ is the model dimension and $n$ is the sample size. We then propose a new method with less gradient noise variance and improve the upper bound to $\tilde{O}(\frac{d^{1/3}}{(nε)^{2/3}})$, which matches the best-known result for DP Empirical Risk Minimization with non-convex loss. We also discuss several lower bounds of private minimax optimization. Finally, experiments on AUC maximization, generative adversarial networks, and temporal difference learning with real-world data support our theoretical analysis.
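
The general shape of a differentially private SGDA loop can be sketched as follows. This is a generic illustration, not the paper's algorithm: it uses the common recipe of per-sample gradient clipping plus Gaussian noise, with a descent step on the minimization variable and an ascent step on the maximization variable. The toy loss $f_i(x, y) = y \langle a_i, \tanh(x) \rangle - y^2/2$ (nonconvex in $x$, strongly concave in $y$), the function names, and all hyperparameters are assumptions made for the example. Choosing `sigma` to meet a target privacy budget requires a privacy accountant, which is not shown.

```python
import numpy as np

def clip_rows(G, C):
    """Scale each row of G so that its l2 norm is at most C."""
    norms = np.linalg.norm(G, axis=1, keepdims=True)
    return G * np.minimum(1.0, C / np.maximum(norms, 1e-12))

def per_sample_grads(x, y, A):
    """Per-sample gradients of the toy loss f_i(x, y) = y * <a_i, tanh(x)> - y**2 / 2."""
    t = np.tanh(x)
    gx = y * A * (1.0 - t ** 2)   # gradient w.r.t. x, shape (batch, d)
    gy = (A @ t - y)[:, None]     # gradient w.r.t. y, shape (batch, 1)
    return gx, gy

def dp_sgda(A, steps=100, batch=32, lr_x=0.05, lr_y=0.2, C=1.0, sigma=1.0, seed=0):
    """Noisy clipped SGDA on the toy loss (illustrative sketch only)."""
    rng = np.random.default_rng(seed)
    n, d = A.shape
    x, y = 0.1 * rng.normal(size=d), 0.0
    noise_std = sigma * C / batch  # noise scale on the averaged clipped gradient
    for _ in range(steps):
        idx = rng.choice(n, size=batch, replace=False)
        gx, gy = per_sample_grads(x, y, A[idx])
        # Clip each sample's gradient, average, then add Gaussian noise.
        noisy_gx = clip_rows(gx, C).mean(axis=0) + rng.normal(scale=noise_std, size=d)
        noisy_gy = clip_rows(gy, C).mean() + rng.normal(scale=noise_std)
        x = x - lr_x * noisy_gx   # descent step on the min variable
        y = y + lr_y * noisy_gy   # ascent step on the max variable
    return x, y
```

Clipping bounds each sample's influence on the update, which is what allows the added Gaussian noise to be scaled to the clipping threshold `C`. A method with lower gradient noise variance, as the abstract describes, would replace the plain noisy mini-batch gradient used here with a different estimator.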
