LG May 18, 2022
Slowly Changing Adversarial Bandit Algorithms are Efficient for Discounted MDPs
Ian A. Kash, Lev Reyzin, Zishun Yu
Reinforcement learning generalizes multi-armed bandit problems with additional difficulties of a longer planning horizon and unknown transition kernel. We explore a black-box reduction from discounted infinite-horizon tabular reinforcement learning to multi-armed bandits, where, specifically, an independent bandit learner is placed in each state. We show that, under ergodicity and fast mixing assumptions, any slowly changing adversarial bandit algorithm achieving optimal regret in the adversarial bandit setting can also attain optimal expected regret in infinite-horizon discounted Markov decision processes, with respect to the number of rounds $T$. Furthermore, we examine our reduction using a specific instance of the exponential-weight algorithm.
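The per-state reduction described above can be sketched as follows. This is a minimal illustration, not the paper's algorithm: it uses textbook EXP3 as the exponential-weight learner, and the class name `Exp3`, the helper `make_state_learners`, the exploration rate `gamma`, and the choice of feedback signal are all assumptions made for the example.

```python
import math
import random


class Exp3:
    """Textbook EXP3 learner for rewards in [0, 1] (illustrative only)."""

    def __init__(self, n_actions, gamma=0.1, rng=None):
        self.n = n_actions
        self.gamma = gamma                  # exploration rate (assumed value)
        self.weights = [1.0] * n_actions
        self.rng = rng or random.Random()

    def probabilities(self):
        # Mix the normalized weights with uniform exploration.
        total = sum(self.weights)
        return [(1 - self.gamma) * w / total + self.gamma / self.n
                for w in self.weights]

    def act(self):
        return self.rng.choices(range(self.n), weights=self.probabilities())[0]

    def update(self, action, reward):
        # Importance-weighted reward estimate for the played action only.
        p = self.probabilities()[action]
        self.weights[action] *= math.exp(self.gamma * (reward / p) / self.n)


def make_state_learners(n_states, n_actions, seed=0):
    """Place one independent bandit learner in each state."""
    rng = random.Random(seed)
    return [Exp3(n_actions, rng=rng) for _ in range(n_states)]
```

On visiting state `s`, the agent would call `learners[s].act()` and later pass some bounded feedback to `learners[s].update(...)`; which feedback signal to use is exactly what the reduction must specify, and it is left open here.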
17.2 ML May 19
Contradiction Graphs Determine VC Dimension
Jesse Campbell, Daniel Ibaibarriaga, Lev Reyzin
We study the contradiction graphs associated with binary concept classes. For a class $H \subseteq \{0,1\}^X$, the order-$m$ contradiction graph $G_m(H)$ has as vertices the $H$-realizable labeled sequences of length $m$, with two vertices adjacent when the two sequences assign opposite labels to some common domain point. Our main result is that the single graph $G_m(H)$ determines the threshold predicate $\mathrm{VCdim}(H)\ge m$. Consequently, the full sequence $(G_m(H))_{m \ge 1}$ determines the exact VC dimension and, in particular, detects finite versus infinite VC dimension, answering a question posed by Alon et al. (2024).
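To make the definition of $G_m(H)$ concrete, here is a brute-force sketch for a finite class over a finite domain. It assumes that sequences may repeat domain points and that each concept is given as a dictionary from domain points to labels; the function name `contradiction_graph` is hypothetical.

```python
from itertools import product


def contradiction_graph(domain, concepts, m):
    """Build the order-m contradiction graph of a finite class.

    concepts: list of dicts mapping every domain point to 0 or 1.
    Returns (vertices, edges); each vertex is a tuple of (point, label) pairs.
    """
    # A labeled sequence is H-realizable if some concept agrees with all labels.
    vertices = set()
    for h in concepts:
        for points in product(domain, repeat=m):
            vertices.add(tuple((x, h[x]) for x in points))
    vertices = sorted(vertices)

    def contradict(u, v):
        # Adjacent when some common point receives opposite labels.
        return any(x == x2 and y != y2 for (x, y) in u for (x2, y2) in v)

    edges = {(u, v)
             for i, u in enumerate(vertices)
             for v in vertices[i + 1:]
             if contradict(u, v)}
    return vertices, edges
```

The enumeration is exponential in $m$, so this is only meant for checking tiny examples by hand.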
LG Oct 7, 2023
Applications of Littlestone dimension to query learning and to compression
Hunter Chase, James Freitag, Lev Reyzin
In this paper we give several applications of Littlestone dimension. The first is to the model of \cite{angluin2017power}, where we extend their results for learning by equivalence queries with random counterexamples. Second, we extend that model to infinite concept classes with an additional source of randomness. Third, we give improved results on the relationship of Littlestone dimension to classes with extended $d$-compression schemes, proving a strong version of a conjecture of \cite{floyd1995sample} for Littlestone dimension.
LG Apr 14, 2022
A Unified Analysis of Dynamic Interactive Learning
Xing Gao, Thomas Maranzatto, Lev Reyzin
In this paper we investigate the problem of learning evolving concepts over a combinatorial structure. Previous work by Emamjomeh-Zadeh et al. [2020] introduced dynamics into interactive learning as a way to model non-static user preferences in clustering problems or recommender systems. We make several contributions to this problem. First, we give a framework that captures both of the models analyzed by Emamjomeh-Zadeh et al. [2020], which allows us to study any type of concept evolution and matches the query complexity bounds and running time guarantees of the previous models. Using this general model, we solve the open problem of closing the gap between the upper and lower bounds on query complexity. Finally, we study an efficient algorithm where the learner simply follows the feedback at each round, and we provide mistake bounds for low-diameter graphs such as cliques, stars, and general $o(\log n)$ diameter graphs by using a Markov chain model.
0.9 CC Apr 8
Multiple Planted Structures Below $\sqrt{n}$: An SoS Integrality Gap and an SQ Lower Bound
Matvey Mosievskiy, Lev Reyzin
We study computational limitations in \emph{multi-plant} average-case inference problems, in which $t$ disjoint planted structures of size $k$ are embedded in a random background on $n$ elements. A natural parameter in this setting is the total planted size $K := kt$. For several classic planted-subgraph problems, including planted clique, existing algorithmic and lower-bound evidence suggests a characteristic computational threshold near $\sqrt{n}$ in the single-plant setting. Our main result is a Sum-of-Squares (SoS) integrality gap for refuting the presence of multiple planted cliques. Specifically, for $G \sim G(n,1/2)$, we construct a degree-$d$ SoS pseudoexpectation for the natural relaxation that maximizes the total size of up to $t$ disjoint cliques. Throughout the regime $kt \le n^{1/2 - c\sqrt{d/\log n}}$, for a universal constant $c>0$, this relaxation achieves objective value $kt(1-o(1))$, and therefore degree-$d$ SoS cannot certify an upper bound below $kt$. This extends the planted-clique SoS lower bounds of \cite{BarakHKKMP19} to a multi-plant setting with explicit disjointness constraints. As complementary evidence from a different computational model, we prove a lower bound in the statistical query (SQ) framework, extending the results of \cite{FeldmanGRVX17}. We show that for detecting $t$ disjoint planted $k \times k$ bicliques (equivalently, a row-mixture distribution), when $kt = O(n^{1/2-δ})$ for any fixed $δ>0$, no polynomial-time SQ algorithm can distinguish the planted and null distributions with constant advantage.
LG Oct 6, 2025
On the Hardness of Learning Regular Expressions
Idan Attias, Lev Reyzin, Nathan Srebro et al.
Despite the theoretical significance and wide practical use of regular expressions, the computational complexity of learning them has been largely unexplored. We study the computational hardness of improperly learning regular expressions in the PAC model and with membership queries. We show that PAC learning is hard even under the uniform distribution on the hypercube, and also prove hardness of distribution-free learning with membership queries. Furthermore, if regular expressions are extended with complement or intersection, we establish hardness of learning with membership queries even under the uniform distribution. We emphasize that these results do not follow from existing hardness results for learning DFAs or NFAs, since the descriptive complexity of regular languages can differ exponentially between DFAs, NFAs, and regular expressions.
DS May 9, 2025
Learning-Augmented Algorithms for Boolean Satisfiability
Idan Attias, Xing Gao, Lev Reyzin
Learning-augmented algorithms are a prominent recent development in beyond worst-case analysis. In this framework, a problem instance is provided with a prediction (``advice'') from a machine-learning oracle, which provides partial information about an optimal solution, and the goal is to design algorithms that leverage this advice to improve worst-case performance. We study the classic Boolean satisfiability (SAT) decision and optimization problems within this framework using two forms of advice. ``Subset advice'' provides a random $ε$ fraction of the variables from an optimal assignment, whereas ``label advice'' provides noisy predictions for all variables in an optimal assignment. For the decision problem $k$-SAT, by using the subset advice we accelerate the exponential running time of the PPSZ family of algorithms due to Paturi, Pudlák, Saks and Zane, which currently represent the state of the art in the worst case. We accelerate the running time by a multiplicative factor of $2^{-c}$ in the base of the exponent, where $c$ is a function of $ε$ and $k$. For the optimization problem, we show how to incorporate subset advice in a black-box fashion with any $α$-approximation algorithm, improving the approximation ratio to $α+ (1 - α)ε$. Specifically, we achieve approximations of $0.94 + Ω(ε)$ for MAX-$2$-SAT, $7/8 + Ω(ε)$ for MAX-$3$-SAT, and $0.79 + Ω(ε)$ for MAX-SAT. Moreover, for label advice, we obtain near-optimal approximation for instances with large average degree, thereby generalizing recent results on MAX-CUT and MAX-$2$-LIN.
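The improved ratio $α + (1 - α)ε$ stated above is easy to evaluate for concrete values. The snippet below is only an arithmetic illustration of that formula; the function name `improved_ratio` and the sample value $ε = 1/10$ are assumptions chosen for the example.

```python
from fractions import Fraction


def improved_ratio(alpha, eps):
    """Evaluate alpha + (1 - alpha) * eps, the ratio quoted in the abstract."""
    return alpha + (1 - alpha) * eps


# Example: a 7/8-approximation combined with a 1/10 fraction of advice.
ratio = improved_ratio(Fraction(7, 8), Fraction(1, 10))
print(ratio, float(ratio))
```

With $ε = 0$ the formula returns $α$ unchanged, and with $ε = 1$ it returns $1$, which matches the intuition that full advice reveals an optimal assignment.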
IT Jan 22, 2025
Non-adaptive Learning of Random Hypergraphs with Queries
Bethany Austhof, Lev Reyzin, Erasmo Tani
We study the problem of learning a hidden hypergraph $G=(V,E)$ by making a single batch of queries (non-adaptively). We consider the hyperedge detection model, in which every query must be of the form: ``Does this set $S\subseteq V$ contain at least one full hyperedge?'' In this model, it is known that no algorithm can non-adaptively learn arbitrary hypergraphs by making fewer than $Ω(\min\{m^2\log n, n^2\})$ queries, even when the hypergraph is constrained to be $2$-uniform (i.e., the hypergraph is simply a graph). Recently, Li et al. overcame this lower bound in the setting in which $G$ is a graph by assuming that the graph to be learned is sampled from an Erdős-Rényi model. We generalize the result of Li et al. to the setting of random $k$-uniform hypergraphs. To achieve this result, we leverage a novel equivalence between the problem of learning a single hyperedge and the standard group testing problem. This latter result may also be of independent interest.
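The hyperedge detection query described above can be modeled with a very small oracle. The following sketch only illustrates the query model, not the learning algorithm; the names `hyperedge_detection_oracle` and `answer_batch` are hypothetical.

```python
def hyperedge_detection_oracle(hyperedges):
    """Return a query function for a hidden hypergraph.

    query(S) answers: does S contain at least one full hyperedge?
    """
    edges = [frozenset(e) for e in hyperedges]

    def query(subset):
        s = frozenset(subset)
        return any(e <= s for e in edges)

    return query


def answer_batch(query, batch):
    """Non-adaptive setting: every query set is fixed before any answer is seen."""
    return [query(s) for s in batch]
```

A non-adaptive learner must choose the whole `batch` up front and then reconstruct the hypergraph from the returned list of answers alone.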
LG Dec 19, 2020
Communication-Aware Collaborative Learning
Avrim Blum, Shelby Heinecke, Lev Reyzin
Algorithms for noiseless collaborative PAC learning have been analyzed and optimized in recent years with respect to sample complexity. In this paper, we study collaborative PAC learning with the goal of reducing communication cost at essentially no penalty to the sample complexity. We develop communication-efficient collaborative PAC learning algorithms using distributed boosting. We then consider the communication cost of collaborative learning in the presence of classification noise. As an intermediate step, we show how collaborative PAC learning algorithms can be adapted to handle classification noise. With this insight, we develop communication-efficient algorithms for collaborative PAC learning that are robust to classification noise.
LG Apr 7, 2020
On the Complexity of Learning from Label Proportions
Benjamin Fish, Lev Reyzin
In the problem of learning with label proportions, which we call LLP learning, the training data is unlabeled, and only the proportions of examples receiving each label are given. The goal is to learn a hypothesis that predicts the proportions of labels on the distribution underlying the sample. This model of learning is applicable to a wide variety of settings, including predicting the number of votes for candidates in political elections from polls. In this paper, we formally define this class, resolve foundational questions regarding the computational complexity of LLP, and characterize its relationship to PAC learning. Among our results, we show, perhaps surprisingly, that for finite VC classes what can be efficiently LLP learned is a strict subset of what can be learned efficiently in PAC, under standard complexity assumptions. We also show that there exist classes of functions whose learnability in LLP is independent of ZFC, the standard set theoretic axioms. This implies that LLP learning cannot be easily characterized (like PAC by VC dimension).
LG Apr 1, 2020
Statistical Queries and Statistical Algorithms: Foundations and Applications
Lev Reyzin
We give a survey of the foundations of statistical queries and their many applications to other areas. We introduce the model, give the main definitions, and explore the fundamental theory of statistical queries and how it connects to various notions of learnability. We also give a detailed summary of some of the applications of statistical queries to other areas, including optimization, evolvability, and differential privacy.
LG Mar 30, 2020
On Biased Random Walks, Corrupted Intervals, and Learning Under Adversarial Design
Daniel Berend, Aryeh Kontorovich, Lev Reyzin et al.
We tackle some fundamental problems in probability theory on corrupted random processes on the integer line. We analyze when a biased random walk is expected to reach its bottommost point and when intervals of integer points can be detected under a natural model of noise. We apply these results to problems in learning thresholds and intervals under a new model for learning under adversarial design.
DS Feb 26, 2020
On Learning a Hidden Directed Graph with Path Queries
Mano Vikash Janardhanan, Lev Reyzin
In this paper, we consider the problem of reconstructing a directed graph using path queries. In this query model of learning, a graph is hidden from the learner, and the learner can access information about it only through path queries. For a source and destination node, a path query returns whether there is a directed path from the source to the destination node in the hidden graph. We first give bounds for learning graphs on $n$ vertices and $k$ strongly connected components. We then study the case of bounded degree directed trees and give new algorithms for learning "almost-trees" -- directed trees to which extra edges have been added. We also give some lower bound constructions justifying our approach.
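The path-query model above can be illustrated with a small oracle and a naive use of it. This sketch is not the paper's algorithm: it only groups vertices into strongly connected components by asking queries in both directions, using at most $2nk$ queries for $n$ vertices and $k$ components. The names `make_path_oracle` and `scc_by_queries` are hypothetical.

```python
from collections import deque


def make_path_oracle(adj):
    """Wrap a hidden directed graph (adjacency dict) as a path-query oracle."""

    def path_query(s, t):
        # Breadth-first search; a vertex trivially reaches itself.
        seen = {s}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            if u == t:
                return True
            for v in adj.get(u, ()):
                if v not in seen:
                    seen.add(v)
                    queue.append(v)
        return False

    return path_query


def scc_by_queries(vertices, path_query):
    """Partition vertices into strongly connected components via path queries."""
    components = []
    for v in vertices:
        for comp in components:
            rep = comp[0]
            # v and rep share a component iff each reaches the other.
            if path_query(v, rep) and path_query(rep, v):
                comp.append(v)
                break
        else:
            components.append([v])
    return components
```

Only the learner-side function `scc_by_queries` respects the query model; the breadth-first search lives inside the oracle and stands in for the hidden graph.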
LG Feb 12, 2019
Crowdsourced PAC Learning under Classification Noise
Shelby Heinecke, Lev Reyzin
In this paper, we analyze PAC learnability from labels produced by crowdsourcing. In our setting, unlabeled examples are drawn from a distribution and labels are crowdsourced from workers who operate under classification noise, each with their own noise parameter. We develop an end-to-end crowdsourced PAC learning algorithm that takes unlabeled data points as input and outputs a trained classifier. Our three-step algorithm incorporates majority voting, pure-exploration bandits, and noisy-PAC learning. We prove several guarantees on the number of tasks labeled by workers for PAC learning in this setting and show that our algorithm improves upon the baseline by reducing the total number of tasks given to workers. We demonstrate the robustness of our algorithm by exploring its application to additional realistic crowdsourcing settings.
LG Sep 28, 2017
Sampling Without Compromising Accuracy in Adaptive Data Analysis
Benjamin Fish, Lev Reyzin, Benjamin I. P. Rubinstein
In this work, we study how to use sampling to speed up mechanisms for answering adaptive queries into datasets without reducing the accuracy of those mechanisms. This matters when both the datasets and the number of queries asked are very large. In particular, we describe a mechanism that provides a polynomial speed-up per query over previous mechanisms, without needing to increase the total amount of data required to maintain the same generalization error as before. We prove that this speed-up holds for arbitrary statistical queries. We also provide an even faster method for achieving statistically meaningful responses wherein the mechanism is only allowed to see a constant number of samples from the data per query. Finally, we show that our general results yield a simple, fast, and unified approach for adaptively optimizing convex and strongly convex functions over a dataset.