Improved Regret in Stochastic Decision-Theoretic Online Learning under Differential Privacy
This work addresses a theoretical gap in privacy-preserving online learning, providing tighter bounds that are incremental but important for the field.
The paper tackles the open problem of optimal instance-dependent rates for stochastic decision-theoretic online learning under differential privacy, achieving an improved upper bound of O(log K / Δ_min + log^2 K / ε) that is independent of T and has only logarithmic dependency in K, and introduces a deterministic setting with matching upper and lower bounds of Θ(log K / ε).
Hu and Mehta (2024) posed an open problem: what is the optimal instance-dependent rate for the stochastic decision-theoretic online learning (with $K$ actions and $T$ rounds) under $\varepsilon$-differential privacy? Before, the best known upper bound and lower bound are $O\left(\frac{\log K}{Δ_{\min}} + \frac{\log K\log T}{\varepsilon}\right)$ and $Ω\left(\frac{\log K}{Δ_{\min}} + \frac{\log K}{\varepsilon}\right)$ (where $Δ_{\min}$ is the gap between the optimal and the second actions). In this paper, we partially address this open problem by having two new results. First, we provide an improved upper bound for this problem $O\left(\frac{\log K}{Δ_{\min}} + \frac{\log^2K}{\varepsilon}\right)$, which is $T$-independent and only has a log dependency in $K$. Second, to further understand the gap, we introduce the \textit{deterministic setting}, a weaker setting of this open problem, where the received loss vector is deterministic. At this weaker setting, a direct application of the analysis and algorithms from the original setting still leads to an extra log factor. We conduct a novel analysis which proves upper and lower bounds that match at $Θ(\frac{\log K}{\varepsilon})$.