LGFeb 5

Private Prediction via Shrinkage

arXiv:2602.05219v1

Originality Highly original

AI Analysis

This work addresses the challenge of efficient private prediction in streaming settings, offering significant improvements in sample complexity for machine learning practitioners dealing with privacy constraints.

The paper tackles the problem of differentially private prediction, where an algorithm must answer a stream of queries while maintaining privacy with respect to a labeled sample set, reducing the dependence on the number of queries from square-root to polylogarithmic. For an oblivious adversary and any concept class, it uses O~(VC(C)^{3.5} log^{3.5} T) labeled examples, and for an adaptive adversary and halfspaces, it uses O~(d^{5.5} log T) examples.

We study differentially private prediction introduced by Dwork and Feldman (COLT 2018): an algorithm receives one labeled sample set $S$ and then answers a stream of unlabeled queries while the output transcript remains $(\varepsilon,δ)$-differentially private with respect to $S$. Standard composition yields a $\sqrt{T}$ dependence for $T$ queries. We show that this dependence can be reduced to polylogarithmic in $T$ in streaming settings. For an oblivious online adversary and any concept class $\mathcal{C}$, we give a private predictor that answers $T$ queries with $|S|= \tilde{O}(VC(\mathcal{C})^{3.5}\log^{3.5}T)$ labeled examples. For an adaptive online adversary and halfspaces over $\mathbb{R}^d$, we obtain $|S|=\tilde{O}\left(d^{5.5}\log T\right)$.

View on arXiv PDF

Similar