LG, CC, DS | Apr 8, 2013

Learning Coverage Functions and Private Release of Marginals

arXiv:1304.2079v3 | 43 citations
Originality: Incremental advance
AI Analysis

This work addresses the challenge of learning coverage functions efficiently, a problem with applications in private data release. It is rated incremental because it builds on prior structural insights and hardness results.

The paper tackles the problem of approximating and learning coverage functions, a subclass of submodular functions. It provides the first fully-polynomial algorithm in the PMAC model, achieving a factor 1+γ approximation on all but a δ-fraction of points in poly(n,1/γ,1/δ) time. It also shows that coverage functions are agnostically learnable with excess ℓ1-error ε over product and symmetric distributions in n^{log(1/ε)} time.
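The PMAC-style guarantee described above can be checked empirically on a toy instance. The sketch below is illustrative only and is not the paper's learning algorithm: it assumes the multiplicative criterion h(x) ≤ c(x) ≤ (1+γ)·h(x) under uniform sampling, and the names `SETS`, `c`, `h_good`, `h_bad`, and `pmac_failure_rate` are hypothetical.

```python
import random

# Hypothetical toy coverage function on n = 3 items with unit weights.
SETS = [{0, 1}, {1, 2}, {2, 3}]


def c(x):
    """Number of universe elements covered by the sets selected by bit vector x."""
    covered = set()
    for i, bit in enumerate(x):
        if bit:
            covered |= SETS[i]
    return float(len(covered))


def pmac_failure_rate(target, hypothesis, n, gamma, samples=2000, seed=0):
    """Estimate the fraction of uniform points x in {0,1}^n where
    hypothesis(x) <= target(x) <= (1 + gamma) * hypothesis(x) fails.
    (Assumed form of the multiplicative criterion.)"""
    rng = random.Random(seed)
    bad = 0
    for _ in range(samples):
        x = tuple(rng.randint(0, 1) for _ in range(n))
        t, h = target(x), hypothesis(x)
        if not (h <= t <= (1.0 + gamma) * h):
            bad += 1
    return bad / samples


gamma = 0.1


def h_good(x):
    # Slight underestimate that stays within the factor 1 + gamma.
    return c(x) / (1.0 + gamma / 2)


def h_bad(x):
    # Always overestimates, so the lower inequality fails everywhere.
    return 2.0 * c(x) + 1.0


rate_good = pmac_failure_rate(c, h_good, 3, gamma)
rate_bad = pmac_failure_rate(c, h_bad, 3, gamma)
```

A hypothesis meets the guarantee for a given δ when the estimated failure rate is at most δ, up to sampling error.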

We study the problem of approximating and learning coverage functions. A function $c: 2^{[n]} \rightarrow \mathbf{R}^{+}$ is a coverage function if there exists a universe $U$ with non-negative weights $w(u)$ for each $u \in U$ and subsets $A_1, A_2, \ldots, A_n$ of $U$ such that $c(S) = \sum_{u \in \cup_{i \in S} A_i} w(u)$. Alternatively, coverage functions can be described as non-negative linear combinations of monotone disjunctions. They are a natural subclass of submodular functions and arise in a number of applications. We give an algorithm that for any $\gamma, \delta > 0$, given random and uniform examples of an unknown coverage function $c$, finds a function $h$ that approximates $c$ within factor $1+\gamma$ on all but $\delta$-fraction of the points in time $\mathrm{poly}(n, 1/\gamma, 1/\delta)$. This is the first fully-polynomial algorithm for learning an interesting class of functions in the demanding PMAC model of Balcan and Harvey (2011). Our algorithms are based on several new structural properties of coverage functions. Using the results in (Feldman and Kothari, 2014), we also show that coverage functions are learnable agnostically with excess $\ell_1$-error $\epsilon$ over all product and symmetric distributions in time $n^{\log(1/\epsilon)}$. In contrast, we show that, without assumptions on the distribution, learning coverage functions is at least as hard as learning polynomial-size disjoint DNF formulas, a class of functions for which the best known algorithm runs in time $2^{\tilde{O}(n^{1/3})}$ (Klivans and Servedio, 2004). As an application of our learning results, we give simple differentially-private algorithms for releasing monotone conjunction counting queries with low average error. In particular, for any $k \leq n$, we obtain private release of $k$-way marginals with average error $\bar{\alpha}$ in time $n^{O(\log(1/\bar{\alpha}))}$.
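To make the definition in the abstract concrete, here is a minimal sketch that evaluates a coverage function both from its set-system form and as a non-negative combination of monotone disjunctions, then brute-force checks monotonicity and submodularity. The toy universe, weights, and all function names are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def coverage(sets, weights, S):
    """c(S): total weight of universe elements covered by A_i for i in S."""
    covered = set()
    for i in S:
        covered |= sets[i]
    return sum(weights[u] for u in covered)


def coverage_via_disjunctions(sets, weights, S):
    """Same value, written as a sum over u of w(u) * OR_{i : u in A_i} [i in S]."""
    S = set(S)
    total = 0.0
    for u, w in weights.items():
        if any(i in S for i, A in enumerate(sets) if u in A):
            total += w
    return total


def is_monotone(values):
    """c(S) <= c(T) whenever S is a subset of T."""
    return all(values[S] <= values[T] for S in values for T in values if S <= T)


def is_submodular(values, tol=1e-12):
    """c(S) + c(T) >= c(S | T) + c(S & T) for all pairs S, T."""
    return all(
        values[S] + values[T] + tol >= values[S | T] + values[S & T]
        for S in values
        for T in values
    )


# Hypothetical toy instance: universe {a, b, c, d}, three subsets A_0, A_1, A_2.
sets = [{"a", "b"}, {"b", "c"}, {"c", "d"}]
weights = {"a": 1.0, "b": 2.0, "c": 0.5, "d": 3.0}
n = len(sets)

subsets = [frozenset(t) for r in range(n + 1) for t in combinations(range(n), r)]
values = {S: coverage(sets, weights, S) for S in subsets}
agree = all(values[S] == coverage_via_disjunctions(sets, weights, S) for S in subsets)
```

The brute-force checks enumerate all $2^n$ subsets, so they are only practical for very small $n$; they serve to illustrate the properties, not to test them at scale.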

