Learning versus Refutation in Noninteractive Local Differential Privacy
This work establishes theoretical limits for privacy-preserving machine learning in the non-interactive local differential privacy (LDP) setting.
The paper characterizes the sample complexity of agnostic PAC learning under non-interactive local differential privacy, showing that it is determined by the approximate γ₂ norm of a natural matrix associated with the concept class. Combined with prior work, this yields an equivalence between learning and refutation in the agnostic setting.
We study two basic statistical tasks in non-interactive local differential privacy (LDP): learning and refutation. Learning requires finding a concept that best fits an unknown target function (from labelled samples drawn from a distribution), whereas refutation requires distinguishing data distributions that are well-correlated with some concept in the class from distributions where the labels are random. Our main result is a complete characterization of the sample complexity of agnostic PAC learning for non-interactive LDP protocols. We show that the optimal sample complexity for any concept class is captured by the approximate $\gamma_2$ norm of a natural matrix associated with the class. Combined with previous work [Edmonds, Nikolov and Ullman, 2019], this gives an equivalence between learning and refutation in the agnostic setting.
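
To make the two tasks concrete, the following is a minimal Python sketch of a simple non-interactive LDP protocol based on randomized response. It is an illustration only and is not the protocol analyzed in the paper: it does not use the $\gamma_2$ factorization and makes no claim of optimal sample complexity. All names (`randomized_response`, `estimate_correlations`, `learn`, `refute`) and the refutation threshold are assumptions introduced here. Each user is pre-assigned one concept $c$, computes the bit $c(x) \cdot y \in \{-1, +1\}$ from their own labelled sample $(x, y)$, and sends a single randomized report, so no interaction is needed.

```python
import math
import random


def randomized_response(bit, epsilon, rng):
    """Return a +/-1 bit unchanged with probability e^eps / (e^eps + 1), else flipped.

    This is the standard eps-LDP randomizer for a single bit.
    """
    keep = math.exp(epsilon) / (math.exp(epsilon) + 1.0)
    return bit if rng.random() < keep else -bit


def estimate_correlations(samples, concepts, epsilon, rng):
    """Estimate E[c(x) * y] for every concept from one noisy bit per user.

    User i is assigned concept i mod k in advance (hypothetical assignment
    rule), so the protocol is non-interactive.
    """
    k = len(concepts)
    sums = [0.0] * k
    counts = [0] * k
    for i, (x, y) in enumerate(samples):
        j = i % k
        sums[j] += randomized_response(concepts[j](x) * y, epsilon, rng)
        counts[j] += 1
    # A report has expectation (e^eps - 1) / (e^eps + 1) times the true bit,
    # so multiplying by the inverse factor debiases the average.
    scale = (math.exp(epsilon) + 1.0) / (math.exp(epsilon) - 1.0)
    return [scale * s / max(c, 1) for s, c in zip(sums, counts)]


def learn(estimates):
    """Learning: return the index of the concept with the largest estimated correlation."""
    return max(range(len(estimates)), key=lambda j: estimates[j])


def refute(estimates, threshold):
    """Refutation: declare the labels random if no concept appears well-correlated.

    The threshold is an illustrative parameter, not a value from the paper.
    """
    return max(estimates) < threshold
```

In this sketch, learning and refutation consume the same privatized correlation estimates: learning takes the arg max, while refutation thresholds the max. The naive assignment splits the users evenly across concepts, so its cost grows with the size of the class; the paper's characterization via the approximate $\gamma_2$ norm concerns what the best possible non-interactive protocol can achieve.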