LGCCDMMay 21, 2014

Approximate resilience, monotonicity, and the complexity of agnostic learning

arXiv:1405.5268v234 citations
Originality Incremental advance
AI Analysis

This work addresses the complexity of agnostic learning for researchers in computational learning theory, providing a novel characterization and structural results, though it is incremental in building on prior concepts like resilience and SQ framework.

The paper tackles the problem of characterizing the complexity of agnostic learning over uniform distributions by introducing the notion of approximate resilience, showing that if functions in a class are far from d-resilient, agnostic learning is efficient (time n^O(d)), while if close, it is hard (complexity at least n^Ω(d) in the SQ framework). It also constructs monotone functions that are close to highly resilient, implying nearly optimal lower bounds for agnostic learning of monotone juntas.

A function $f$ is $d$-resilient if all its Fourier coefficients of degree at most $d$ are zero, i.e., $f$ is uncorrelated with all low-degree parities. We study the notion of $\mathit{approximate}$ $\mathit{resilience}$ of Boolean functions, where we say that $f$ is $α$-approximately $d$-resilient if $f$ is $α$-close to a $[-1,1]$-valued $d$-resilient function in $\ell_1$ distance. We show that approximate resilience essentially characterizes the complexity of agnostic learning of a concept class $C$ over the uniform distribution. Roughly speaking, if all functions in a class $C$ are far from being $d$-resilient then $C$ can be learned agnostically in time $n^{O(d)}$ and conversely, if $C$ contains a function close to being $d$-resilient then agnostic learning of $C$ in the statistical query (SQ) framework of Kearns has complexity of at least $n^{Ω(d)}$. This characterization is based on the duality between $\ell_1$ approximation by degree-$d$ polynomials and approximate $d$-resilience that we establish. In particular, it implies that $\ell_1$ approximation by low-degree polynomials, known to be sufficient for agnostic learning over product distributions, is in fact necessary. Focusing on monotone Boolean functions, we exhibit the existence of near-optimal $α$-approximately $\widetildeΩ(α\sqrt{n})$-resilient monotone functions for all $α>0$. Prior to our work, it was conceivable even that every monotone function is $Ω(1)$-far from any $1$-resilient function. Furthermore, we construct simple, explicit monotone functions based on ${\sf Tribes}$ and ${\sf CycleRun}$ that are close to highly resilient functions. Our constructions are based on a fairly general resilience analysis and amplification. These structural results, together with the characterization, imply nearly optimal lower bounds for agnostic learning of monotone juntas.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes