Youngtak Sohn

ST
4papers
126citations
Novelty79%
AI Score55

4 Papers

88.9STMay 14
Sharp Phase Transitions in Estimation with Low-Degree Polynomials

Youngtak Sohn, Alexander S. Wein

High-dimensional planted problems, such as finding a hidden dense subgraph within a random graph, often exhibit a gap between statistical and computational feasibility. While recovering the hidden structure may be statistically possible, it is conjectured to be computationally intractable in certain parameter regimes. A powerful approach to understanding this hardness involves proving lower bounds on the efficacy of low-degree polynomial algorithms. We introduce new techniques for establishing such lower bounds, leading to novel results across diverse settings: planted submatrix, planted dense subgraph, the spiked Wigner model, and the stochastic block model. Notably, our results address the estimation task -- whereas most prior work is limited to hypothesis testing -- and capture sharp phase transitions such as the "BBP" transition in the spiked Wigner model (named for Baik, Ben Arous, and Péché) and the Kesten-Stigum threshold in the stochastic block model. Existing work on estimation either falls short of achieving these sharp thresholds or is limited to polynomials of very low (constant or logarithmic) degree. In contrast, our results rule out estimation with polynomials of degree $n^δ$ where $n$ is the dimension and $δ> 0$ is a constant, and in some cases we pin down the optimal constant $δ$. Our work resolves open problems posed by Hopkins & Steurer (2017) and Schramm & Wein (2022), and provides rigorous support within the low-degree framework for conjectures by Abbe & Sandon (2018) and Lelarge & Miolane (2019).

91.4STMay 28
Low-degree estimation thresholds in planted hypergraphs and tensor PCA

Daniel Fu, Youngtak Sohn

A central question in high-dimensional statistics is to understand statistical--computational gaps: regimes in which recovering a hidden signal is information-theoretically possible but conjectured to be computationally intractable. The low-degree framework offers a concrete way to study this gap by restricting attention to estimators that are polynomials of degree at most $D$ in the observed data. In this paper, we study low-degree estimation in planted dense subhypergraph, sparse tensor PCA, and tensor PCA with a general prior. For the planted dense subhypergraph model on $n$ vertices, we identify two regimes depending on whether the planted set is larger or smaller than $\sqrt{n}$. Above this scale, we identify a sharp threshold for low-degree estimation. Below this scale, we establish hardness in the regimes predicted by prior work, thereby resolving a question of Schramm and Wein (2022) and Sohn and Wein (2025). For sparse tensor PCA, we identify an analogous sharp phase transition. For tensor PCA with a general prior, we prove a low-degree estimation lower bound at the critical signal scale, matching the degree--signal tradeoff suggested by prior work. Our lower bounds apply to degree $D=n^δ$, where $n$ is the dimension and $δ>0$ is a constant, and we complement them with corresponding low-degree upper bounds. In addition, for planted dense subhypergraph and sparse tensor PCA above the $\sqrt{n}$ scale, we convert our upper bounds into polynomial-time algorithms that achieve almost exact recovery above the sharp threshold, yielding polynomial-time algorithms succeeding up to this threshold. Our proofs extend the framework of Sohn and Wein (2025) through a conditional variant that yields the correct signal-to-noise ratio in settings where the unconditional approach is insufficient.

85.5PRApr 7
Fast mixing in Ising models with a negative spectral outlier via Gaussian approximation

Dan Mikulincer, Youngtak Sohn

We study the mixing time of Glauber dynamics for Ising models in which the interaction matrix contains a single negative spectral outlier. This class includes the anti-ferromagnetic Curie-Weiss model, the anti-ferromagnetic Ising model on expander graphs, and the Sherrington-Kirkpatrick model with disorder of negative mean. Existing approaches to rapid mixing rely crucially on log-concavity or spectral width bounds and therefore can break down in the presence of a negative outlier. To address this difficulty, we develop a new covariance approximation method based on Gaussian approximation. This method is implemented via an iterative application of Stein's method to quadratic tilts of sums of bounded random variables, which may be of independent interest. The resulting analysis provides an operator-norm control of the full correlation structure under arbitrary external fields. Combined with the localization schemes of Eldan and Chen, these estimates lead to a modified logarithmic Sobolev inequality and near-optimal mixing time bounds in regimes where spectral width bounds fail. We complement these results by proving exponential lower bounds on the mixing time for low temperature anti-ferromagnetic Ising models on sparse random regular graphs and Erdös-Rényi graphs, based on the existence of gapped states as in the recent work of Sellke.

STNov 5, 2019
The generalization error of max-margin linear classifiers: Benign overfitting and high dimensional asymptotics in the overparametrized regime

Andrea Montanari, Feng Ruan, Youngtak Sohn et al.

Modern machine learning classifiers often exhibit vanishing classification error on the training set. They achieve this by learning nonlinear representations of the inputs that maps the data into linearly separable classes. Motivated by these phenomena, we revisit high-dimensional maximum margin classification for linearly separable data. We consider a stylized setting in which data $(y_i,{\boldsymbol x}_i)$, $i\le n$ are i.i.d. with ${\boldsymbol x}_i\sim\mathsf{N}({\boldsymbol 0},{\boldsymbol Σ})$ a $p$-dimensional Gaussian feature vector, and $y_i \in\{+1,-1\}$ a label whose distribution depends on a linear combination of the covariates $\langle {\boldsymbol θ}_*,{\boldsymbol x}_i \rangle$. While the Gaussian model might appear extremely simplistic, universality arguments can be used to show that the results derived in this setting also apply to the output of certain nonlinear featurization maps. We consider the proportional asymptotics $n,p\to\infty$ with $p/n\to ψ$, and derive exact expressions for the limiting generalization error. We use this theory to derive two results of independent interest: $(i)$ Sufficient conditions on $({\boldsymbol Σ},{\boldsymbol θ}_*)$ for `benign overfitting' that parallel previously derived conditions in the case of linear regression; $(ii)$ An asymptotically exact expression for the generalization error when max-margin classification is used in conjunction with feature vectors produced by random one-layer neural networks.