DS · CR · LG · ML · May 30, 2019

Private Hypothesis Selection

arXiv:1905.13229v5 · 99 citations
Originality: Incremental advance
AI Analysis

The paper provides a generic tool for private learning across distribution classes such as Gaussians and mixtures. It achieves optimal sample complexity for many of these classes when the accuracy parameter is constant, though it builds incrementally on existing hypothesis selection techniques.

The paper tackles the problem of selecting a hypothesis from a finite set of $m$ distributions in a differentially private manner. Its basic algorithm has sample complexity $O\left(\frac{\log m}{α^2} + \frac{\log m}{α\varepsilon}\right)$, so privacy adds only a small cost over the non-private algorithm. The approach extends to infinite classes by relaxing to $(\varepsilon,δ)$-differential privacy.
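
To see why the privacy cost is small, compare the two terms of the bound while ignoring the constants hidden by the $O(\cdot)$. The values $m = 1000$, $α = 0.1$ and $\varepsilon = 1$ below are illustrative assumptions, not figures from the paper, and $\log$ is taken to be the natural logarithm:

$$
\frac{\log m}{α^2} = \frac{\ln 1000}{0.01} \approx 691, \qquad
\frac{\log m}{α\varepsilon} = \frac{\ln 1000}{0.1 \cdot 1} \approx 69, \qquad
\frac{\log m/(α\varepsilon)}{\log m/α^2} = \frac{α}{\varepsilon} = 0.1 .
$$

The privacy term is smaller than the non-private term by a factor of $α/\varepsilon$, so it dominates only when $\varepsilon < α$.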

We provide a differentially private algorithm for hypothesis selection. Given samples from an unknown probability distribution $P$ and a set of $m$ probability distributions $\mathcal{H}$, the goal is to output, in an $\varepsilon$-differentially private manner, a distribution from $\mathcal{H}$ whose total variation distance to $P$ is comparable to that of the best such distribution (which we denote by $α$). The sample complexity of our basic algorithm is $O\left(\frac{\log m}{α^2} + \frac{\log m}{α\varepsilon}\right)$, representing a minimal cost for privacy when compared to the non-private algorithm. We can also handle infinite hypothesis classes $\mathcal{H}$ by relaxing to $(\varepsilon,δ)$-differential privacy. We apply our hypothesis selection algorithm to give learning algorithms for a number of natural distribution classes, including Gaussians, product distributions, sums of independent random variables, piecewise polynomials, and mixture classes. Our hypothesis selection procedure allows us to generically convert a cover for a class to a learning algorithm, complementing known learning lower bounds which are in terms of the size of the packing number of the class. As the covering and packing numbers are often closely related, for constant $α$, our algorithms achieve the optimal sample complexity for many classes of interest. Finally, we describe an application to private distribution-free PAC learning.
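
The selection task described in the abstract can be illustrated with a small sketch. The code below is not the paper's algorithm or its score function. It is a simplified stand-in that assumes a finite domain $\{0, \dots, k-1\}$, scores each hypothesis by its worst discrepancy from the empirical distribution over all Scheffé sets (a minimum-distance style score), and then samples a hypothesis with the standard exponential mechanism. The function name `private_select` and all parameters are hypothetical.

```python
import numpy as np


def private_select(hypotheses, samples, epsilon, rng):
    """Pick a hypothesis index with the exponential mechanism (illustrative sketch).

    hypotheses: (m, k) array, each row a probability vector over {0, ..., k-1}.
    samples:    length-n integer array of draws from the unknown distribution.
    Returns (chosen index, selection probabilities).
    """
    H = np.asarray(hypotheses, dtype=float)
    samples = np.asarray(samples)
    m, k = H.shape
    n = len(samples)
    empirical = np.bincount(samples, minlength=k) / n

    # For every pair (j, l), the Scheffe set is W = {x : H_j(x) > H_l(x)}.
    # Track, for each hypothesis, its largest gap |H_i(W) - empirical(W)|.
    worst = np.zeros(m)
    for j in range(m):
        for l in range(j + 1, m):
            W = H[j] > H[l]
            tau = empirical[W].sum()          # empirical mass of W
            masses = H[:, W].sum(axis=1)      # mass each hypothesis assigns to W
            worst = np.maximum(worst, np.abs(masses - tau))

    # Replacing one sample moves each empirical mass by at most 1/n,
    # so the score -n * worst has sensitivity 1.
    scores = -n * worst

    # Exponential mechanism: P(i) proportional to exp(epsilon * score_i / 2).
    logits = epsilon * scores / 2.0
    logits -= logits.max()                    # stabilise the exponentials
    probs = np.exp(logits)
    probs /= probs.sum()
    return int(rng.choice(m, p=probs)), probs


# Example usage with three made-up hypotheses; the first one generates the data.
rng = np.random.default_rng(0)
H = [[0.7, 0.2, 0.1], [0.1, 0.2, 0.7], [1 / 3, 1 / 3, 1 / 3]]
samples = rng.choice(3, size=2000, p=H[0])
index, probs = private_select(H, samples, epsilon=1.0, rng=rng)
print(index, probs)
```

Because the family of Scheffé sets depends only on the hypotheses and not on the data, the score has sensitivity 1 and the sampling step satisfies $\varepsilon$-differential privacy by the usual exponential mechanism guarantee. The sketch makes no claim to match the sample complexity stated in the abstract.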

