Rocco Servedio

4papers

102citations

Novelty56%

AI Score26

Ranked #166,320 of 205,806 authors (top 81%)#36,183 in LG (top 85%)

4 Papers

LGFeb 10, 2022

Near-Optimal Statistical Query Lower Bounds for Agnostically Learning Intersections of Halfspaces with Gaussian Marginals

Daniel Hsu, Clayton Sanford, Rocco Servedio et al.

We consider the well-studied problem of learning intersections of halfspaces under the Gaussian distribution in the challenging \emph{agnostic learning} model. Recent work of Diakonikolas et al. (2021) shows that any Statistical Query (SQ) algorithm for agnostically learning the class of intersections of $k$ halfspaces over $\mathbb{R}^n$ to constant excess error either must make queries of tolerance at most $n^{-\tildeΩ(\sqrt{\log k})}$ or must make $2^{n^{Ω(1)}}$ queries. We strengthen this result by improving the tolerance requirement to $n^{-\tildeΩ(\log k)}$. This lower bound is essentially best possible since an SQ algorithm of Klivans et al. (2008) agnostically learns this class to any constant excess error using $n^{O(\log k)}$ queries of tolerance $n^{-O(\log k)}$. We prove two variants of our lower bound, each of which combines ingredients from Diakonikolas et al. (2021) with (an extension of) a different earlier approach for agnostic SQ lower bounds for the Boolean setting due to Dachman-Soled et al. (2014). Our approach also yields lower bounds for agnostically SQ learning the class of "convex subspace juntas" (studied by Vempala, 2010) and the class of sets with bounded Gaussian surface area; all of these lower bounds are nearly optimal since they essentially match known upper bounds from Klivans et al. (2008).

LGNov 3, 2018

Learning sparse mixtures of rankings from noisy information

Anindya De, Ryan O'Donnell, Rocco Servedio

We study the problem of learning an unknown mixture of $k$ rankings over $n$ elements, given access to noisy samples drawn from the unknown mixture. We consider a range of different noise models, including natural variants of the "heat kernel" noise framework and the Mallows model. For each of these noise models we give an algorithm which, under mild assumptions, learns the unknown mixture to high accuracy and runs in $n^{O(\log k)}$ time. The best previous algorithms for closely related problems have running times which are exponential in $k$.

DSMar 4, 2017

Sharp bounds for population recovery

Anindya De, Ryan O'Donnell, Rocco Servedio

The population recovery problem is a basic problem in noisy unsupervised learning that has attracted significant research attention in recent years [WY12,DRWY12, MS13, BIMP13, LZ15,DST16]. A number of different variants of this problem have been studied, often under assumptions on the unknown distribution (such as that it has restricted support size). In this work we study the sample complexity and algorithmic complexity of the most general version of the problem, under both bit-flip noise and erasure noise model. We give essentially matching upper and lower sample complexity bounds for both noise models, and efficient algorithms matching these sample complexity bounds up to polynomial factors.

CCDec 9, 2016

Optimal mean-based algorithms for trace reconstruction

Anindya De, Ryan O'Donnell, Rocco Servedio

In the (deletion-channel) trace reconstruction problem, there is an unknown $n$-bit source string $x$. An algorithm is given access to independent traces of $x$, where a trace is formed by deleting each bit of~$x$ independently with probability~$δ$. The goal of the algorithm is to recover~$x$ exactly (with high probability), while minimizing samples (number of traces) and running time. Previously, the best known algorithm for the trace reconstruction problem was due to Holenstein~et~al.; it uses $\exp(\tilde{O}(n^{1/2}))$ samples and running time for any fixed $0 < δ< 1$. It is also what we call a "mean-based algorithm", meaning that it only uses the empirical means of the individual bits of the traces. Holenstein~et~al.~also gave a lower bound, showing that any mean-based algorithm must use at least $n^{\tildeΩ(\log n)}$ samples. In this paper we improve both of these results, obtaining matching upper and lower bounds for mean-based trace reconstruction. For any constant deletion rate $0 < δ< 1$, we give a mean-based algorithm that uses $\exp(O(n^{1/3}))$ time and traces; we also prove that any mean-based algorithm must use at least $\exp(Ω(n^{1/3}))$ traces. In fact, we obtain matching upper and lower bounds even for $δ$ subconstant and $ρ:= 1-δ$ subconstant: when $(\log^3 n)/n \ll δ\leq 1/2$ the bound is $\exp(-Θ(δn)^{1/3})$, and when $1/\sqrt{n} \ll ρ\leq 1/2$ the bound is $\exp(-Θ(n/ρ)^{1/3})$. Our proofs involve estimates for the maxima of Littlewood polynomials on complex disks. We show that these techniques can also be used to perform trace reconstruction with random insertions and bit-flips in addition to deletions. We also find a surprising result: for deletion probabilities $δ> 1/2$, the presence of insertions can actually help with trace reconstruction.