88.8DSMay 31
Constant-Stretch Rounding on the HypersimplexNima Anari, Alireza Haqi, Eric Ma
We study correlated rounding on the hypersimplex, the base polytope of the uniform matroid. For each point $x$ in the hypersimplex, the goal is to sample a $k$-subset $A(x)$ with marginals $x$, while coupling the samples for all choices of $x$ so that nearby inputs produce nearby sets. We give a constant-stretch scheme. Our scheme samples the maximum-entropy $k$-subset distribution with prescribed marginals using a common random ordering and common uniform thresholds. For every $x,y\in[0,1]^n$ with $\sum_i x_i=\sum_i y_i=k$, it satisfies $\mathbb{E}[|A(x)\triangle A(y)|]\le 6\|x-y\|_1$. This improves the previous $O(\log k)$ bound for hypersimplex correlated rounding and answers an open question raised by Naor, Raju, Shetty, Srinivasan, Valieva, and Wajc. By adding dummy coordinates, the same result gives stretch at most $12$ for the at-most-$k$ polytope. The proof was found in a GPT 5.5 Pro Extended conversation prompted by the authors, and Codex was used to help assemble the manuscript under the authors' supervision.
76.4DSMay 31
On Thin Perfect Matchings up to Polylogarithmic FactorsAlireza Haqi, Shayan Oveis Gharan
We resolve the thin matching problem proposed by Anari, Charikar and Ramakrishnan [ACR23] up to polylogarithmic factors. Given a fractional perfect matching $x$, we say a perfect matching $M$ is $α$-thin w.r.t. $x$ if for any cut $(S,\overline{S})$, we have $$ |M \cap E(S,\overline{S})| \leq α\cdot x(S,\overline{S}).$$ [ACR23] conjectured that for any fractional perfect matching $x$, there exists a perfect matching $M$ which is $O(1)$-thin w.r.t. $x$. First, we show that if $M$ is restricted to be in the support of $x$, then $α\geq Ω(n)$ and we complement this by designing an efficient algorithm that outputs an $O(n\log n)$-thin perfect matching where $n$ is the number of vertices. Then, we relax this constraint and show that for any fractional perfect matching $x$, there is a perfect matching $M$ (which is not necessarily in the support of $x$) such that $M$ is $\text{polylog}(n)$-thin w.r.t. $x$. All results work for both bipartite and non-bipartite graphs. We also discuss applications to the metric distortion problem.
39.3DSMar 26
Fast Spanning Tree Sampling in Broadcast Congested CliqueNima Anari, Alireza Haqi
We present the first polylogarithmic-round algorithm for sampling a random spanning tree in the (Broadcast) Congested Clique model. For any constant $c > 0$, our algorithm outputs a sample from a distribution whose total variation distance from the uniform spanning tree distribution is at most $O(n^{-c})$ in at most $c \cdot \log^{O(1)}(n)$ rounds. The exponent hidden in $\log^{O(1)}(n)$ is an absolute constant independent of $c$ and $n$. This is an exponential improvement over the previous best algorithm of Pemmaraju, Roy, and Sobel (PODC 2025) for the Congested Clique model.
DSNov 11, 2025
Parallel Sampling via AutospeculationNima Anari, Carlo Baronio, CJ Chen et al.
We present parallel algorithms to accelerate sampling via counting in two settings: any-order autoregressive models and denoising diffusion models. An any-order autoregressive model accesses a target distribution $μ$ on $[q]^n$ through an oracle that provides conditional marginals, while a denoising diffusion model accesses a target distribution $μ$ on $\mathbb{R}^n$ through an oracle that provides conditional means under Gaussian noise. Standard sequential sampling algorithms require $\widetilde{O}(n)$ time to produce a sample from $μ$ in either setting. We show that, by issuing oracle calls in parallel, the expected sampling time can be reduced to $\widetilde{O}(n^{1/2})$. This improves the previous $\widetilde{O}(n^{2/3})$ bound for any-order autoregressive models and yields the first parallel speedup for diffusion models in the high-accuracy regime, under the relatively mild assumption that the support of $μ$ is bounded. We introduce a novel technique to obtain our results: speculative rejection sampling. This technique leverages an auxiliary ``speculative'' distribution~$ν$ that approximates~$μ$ to accelerate sampling. Our technique is inspired by the well-studied ``speculative decoding'' techniques popular in large language models, but differs in key ways. Firstly, we use ``autospeculation,'' namely we build the speculation $ν$ out of the same oracle that defines~$μ$. In contrast, speculative decoding typically requires a separate, faster, but potentially less accurate ``draft'' model $ν$. Secondly, the key differentiating factor in our technique is that we make and accept speculations at a ``sequence'' level rather than at the level of single (or a few) steps. This last fact is key to unlocking our parallel runtime of $\widetilde{O}(n^{1/2})$.