Lucas Slot

LG
h-index6
4papers
17citations
Novelty50%
AI Score42

4 Papers

LGMar 6
Agnostic learning in (almost) optimal time via Gaussian surface area

Lucas Pesenti, Lucas Slot, Manuel Wiedmer

The complexity of learning a concept class under Gaussian marginals in the difficult agnostic model is closely related to its $L_1$-approximability by low-degree polynomials. For any concept class with Gaussian surface area at most $Γ$, Klivans et al. (2008) show that degree $d = O(Γ^2 / \varepsilon^4)$ suffices to achieve an $\varepsilon$-approximation. This leads to the best-known bounds on the complexity of learning a variety of concept classes. In this note, we improve their analysis by showing that degree $d = \tilde O (Γ^2 / \varepsilon^2)$ is enough. In light of lower bounds due to Diakonikolas et al. (2021), this yields (near) optimal bounds on the complexity of agnostically learning polynomial threshold functions in the statistical query model. Our proof relies on a direct analogue of a construction of Feldman et al. (2020), who considered $L_1$-approximation on the Boolean hypercube.

15.5OCMay 1
On the Distribution of Unweighted Minimum Knapsack Instances with Large SOS Rank

Adam Kurpisz, Lucas Slot, Mikhail Zaytsev

We analyze the sum-of-squares rank of unweighted instances of the Minimum Knapsack (MK) problem, i.e., minimization of $\sum_{i=1}^n x_i$ for 0/1 variables under the constraint $\sum_{i=1}^n x_i \geq q$, with $q \in \mathbb{R}$. Such instances have long served as a testbed for understanding the limitations of lift-and-project methods in Boolean optimization. For example, both the Lovász-Schrijver and Sherali-Adams hierarchies require (maximal) rank $n$ to solve them, already when $q=1/2$ is constant. The SOS hierarchy requires only \emph{sublinear} rank $O(\sqrt{n})$ to solve unweighted MK when $q=1/2$. On the other hand, when $q$ is allowed to vary with~$n$, the SOS rank of the problem may become linear. Interestingly, this is known to happen both when $q$ is large, and when $q$ is very small ($0<q \leq 2^{-n}$). This raises the question of whether we should think of hard instances of unweighted MK as being typical for the SOS hierarchy, or as a consequence of very specific choices of the threshold parameter $q$. In this paper, we address this question by showing new upper and lower bounds on the SOS rank of unweighted MK in the whole regime of the parameter $q$. For $n-q \leq O(1)$, we show that the SOS rank is constant. In contrast, when $q \leq O(1)$, a linear rank is needed if $q$ is exponentially close to an integer. As our main positive result, we show that linear rank is very rare for $q \leq O(1)$. This can be expressed in the language of smoothed analysis: after perturbing $q$ by a Gaussian with mean $0$ and variance $σ^2$, the expected SOS rank of MK is $O(\sqrt{n} \log (n/σ))$.

CCFeb 20, 2025
Low degree conjecture implies sharp computational thresholds in stochastic block model

Jingqiu Ding, Yiding Hua, Lucas Slot et al.

We investigate implications of the (extended) low-degree conjecture (recently formalized in [MW23]) in the context of the symmetric stochastic block model. Assuming the conjecture holds, we establish that no polynomial-time algorithm can weakly recover community labels below the Kesten-Stigum (KS) threshold. In particular, we rule out polynomial-time estimators that, with constant probability, achieve correlation with the true communities that is significantly better than random. Whereas, above the KS threshold, polynomial-time algorithms are known to achieve constant correlation with the true communities with high probability[Mas14,AS15]. To our knowledge, we provide the first rigorous evidence for the sharp transition in recovery rate for polynomial-time algorithms at the KS threshold. Notably, under a stronger version of the low-degree conjecture, our lower bound remains valid even when the number of blocks diverges. Furthermore, our results provide evidence of a computational-to-statistical gap in learning the parameters of stochastic block models. In contrast to prior work, which either (i) rules out polynomial-time algorithms for hypothesis testing with 1-o(1) success probability [Hopkins18, BBK+21a] under the low-degree conjecture, or (ii) rules out low-degree polynomials for learning the edge connection probability matrix [LG23], our approach provides stronger lower bounds on the recovery and learning problem. Our proof combines low-degree lower bounds from [Hopkins18, BBK+21a] with graph splitting and cross-validation techniques. In order to rule out general recovery algorithms, we employ the correlation preserving projection method developed in [HS17].

LGJun 10, 2024
Testably Learning Polynomial Threshold Functions

Lucas Slot, Stefan Tiegel, Manuel Wiedmer

Rubinfeld & Vasilyan recently introduced the framework of testable learning as an extension of the classical agnostic model. It relaxes distributional assumptions which are difficult to verify by conditions that can be checked efficiently by a tester. The tester has to accept whenever the data truly satisfies the original assumptions, and the learner has to succeed whenever the tester accepts. We focus on the setting where the tester has to accept standard Gaussian data. There, it is known that basic concept classes such as halfspaces can be learned testably with the same time complexity as in the (distribution-specific) agnostic model. In this work, we ask whether there is a price to pay for testably learning more complex concept classes. In particular, we consider polynomial threshold functions (PTFs), which naturally generalize halfspaces. We show that PTFs of arbitrary constant degree can be testably learned up to excess error $\varepsilon > 0$ in time $n^{\mathrm{poly}(1/\varepsilon)}$. This qualitatively matches the best known guarantees in the agnostic model. Our results build on a connection between testable learning and fooling. In particular, we show that distributions that approximately match at least $\mathrm{poly}(1/\varepsilon)$ moments of the standard Gaussian fool constant-degree PTFs (up to error $\varepsilon$). As a secondary result, we prove that a direct approach to show testable learning (without fooling), which was successfully used for halfspaces, cannot work for PTFs.