DSMay 29
High-Dimensional Expanders, the Sparsest Cut Problem, and Steurer's ConjectureFarzam Ebrahimnejad, Shayan Oveis Gharan
In 2010, Steurer conjectured that any family of $n$ unit-norm vectors $v_1,\dots,v_n$ with polynomially small average correlation $\mathbb{E}_{i,j}|\langle v_i,v_j\rangle|\leq n^{-ε}$ contains linear-sized constant-separated sets. We refute this conjecture in a strong sense using the machinery of sparse high-dimensional expanders: such vector families do not even have linear-sized $\frac{1}{\log^{1/4-o(1)}(n)}$-separated sets. Consequently, we show that there are families of vertex expanders on $n$ vertices for which the (average) $L_2$-mixing time to the uniform distribution of any reweighted simple random walk is at least $\log^{5/4-o(1)} n$.
DSMay 31
On Thin Perfect Matchings up to Polylogarithmic FactorsAlireza Haqi, Shayan Oveis Gharan
We resolve the thin matching problem proposed by Anari, Charikar and Ramakrishnan [ACR23] up to polylogarithmic factors. Given a fractional perfect matching $x$, we say a perfect matching $M$ is $α$-thin w.r.t. $x$ if for any cut $(S,\overline{S})$, we have $$ |M \cap E(S,\overline{S})| \leq α\cdot x(S,\overline{S}).$$ [ACR23] conjectured that for any fractional perfect matching $x$, there exists a perfect matching $M$ which is $O(1)$-thin w.r.t. $x$. First, we show that if $M$ is restricted to be in the support of $x$, then $α\geq Ω(n)$ and we complement this by designing an efficient algorithm that outputs an $O(n\log n)$-thin perfect matching where $n$ is the number of vertices. Then, we relax this constraint and show that for any fractional perfect matching $x$, there is a perfect matching $M$ (which is not necessarily in the support of $x$) such that $M$ is $\text{polylog}(n)$-thin w.r.t. $x$. All results work for both bipartite and non-bipartite graphs. We also discuss applications to the metric distortion problem.
COApr 2
Trickle-down Theorems via C-Lorentzian Polynomials II: Pairwise Spectral Influence and Improved Dobrushin's ConditionJonathan Leake, Shayan Oveis Gharan
Let $μ$ be a probability distribution on a multi-state spin system on a set $V$ of sites; equivalently, a $d$-partite simplicial complex with distribution $μ$ on maximal faces. For any pair of vertices $u,v\in V$, define the pairwise spectral influence $\mathcal{I}_{u,v}$ as follows. Let $Ï$ be a choice of spins $s_w\in S_w$ for every $w\in V\setminus\{u,v\}$, and construct a matrix in $\mathbb{R}^{(S_u\cup S_v)\times (S_u\cup S_v)}$ where for any $s_u\in S_u, s_v\in S_v$, the $(us_u,vs_v)$-entry is the probability that $s_v$ is the spin of $v$ conditioned on $s_u$ being the spin of $u$ and on $Ï$. Then $\mathcal{I}_{u,v}$ is the maximal second eigenvalue of this matrix, over all choices of spins for all $w\in V\setminus\{u,v\}$. Equivalently, $\mathcal{I}_{u,v}$ is the maximum local spectral expansion of links of codimension $2$ that include a spin for every $w \in V \setminus \{u,v\}$. We show that if the largest eigenvalue of the pairwise spectral influence matrix with entries $\mathcal{I}_{u,v}$ is bounded away from 1, i.e. $λ_{\max}(\mathcal{I})\leq 1-ε$ (and $X$ is connected), then the Glauber dynamics mixes rapidly and generate samples from $μ$. This improves/generalizes the classical Dobrushin's influence matrix as the $\mathcal{I}_{u,v}$ lower-bounds the classical influence of $u\to v$. As an application, we prove that the Glauber dynamics mixes rapidly up to (approximately) the phase transition for the multi-state hardcore model--a widely studied model in telecommunication networks and statistical physics (generalizing the hardcore model) introduced by Mazel and Suhov. As a by-product of our results, we also prove improved/almost optimal trickle-down theorems for partite simplicial complexes. Our proof builds on the trickle-down theorems via $\mathcal{C}$-Lorentzian polynomials machinery recently developed by the authors and Lindberg.
DSApr 23
Sampling from the Hardcore Model on Random Regular Bipartite Graphs above the Uniqueness ThresholdNicholas Kocurek, Shayan Oveis Gharan, Dante Tjowasi
We design an efficient sampling algorithm to generate samples from the hardcore model on random regular bipartite graphs as long as $λ\lesssim \frac{1}{\sqrtΔ}$, where $Δ$ is the degree. Combined with recent work of Jenssen, Keevash and Perkins this implies an FPRAS for the partition function of the hardcore model on random regular bipartite graphs at any fugacity. Our algorithm is shown by analyzing two new Markov chains that work in complementary regimes. Our proof then proceeds by showing the corresponding simplicial complexes are top-link spectral expanders and appealing to the trickle-down theorem to prove fast mixing.
DSJul 6, 2019
Composable Core-sets for Determinant Maximization: A Simple Near-Optimal AlgorithmPiotr Indyk, Sepideh Mahabadi, Shayan Oveis Gharan et al.
``Composable core-sets'' are an efficient framework for solving optimization problems in massive data models. In this work, we consider efficient construction of composable core-sets for the determinant maximization problem. This can also be cast as the MAP inference task for determinantal point processes, that have recently gained a lot of interest for modeling diversity and fairness. The problem was recently studied in [IMOR'18], where they designed composable core-sets with the optimal approximation bound of $\tilde O(k)^k$. On the other hand, the more practical Greedy algorithm has been previously used in similar contexts. In this work, first we provide a theoretical approximation guarantee of $O(C^{k^2})$ for the Greedy algorithm in the context of composable core-sets; Further, we propose to use a Local Search based algorithm that while being still practical, achieves a nearly optimal approximation bound of $O(k)^{2k}$; Finally, we implement all three algorithms and show the effectiveness of our proposed algorithm on standard data sets.
LGOct 20, 2018
A Polynomial Time MCMC Method for Sampling from Continuous DPPsShayan Oveis Gharan, Alireza Rezaei
We study the Gibbs sampling algorithm for continuous determinantal point processes. We show that, given a warm start, the Gibbs sampler generates a random sample from a continuous $k$-DPP defined on a $d$-dimensional domain by only taking $\text{poly}(k)$ number of steps. As an application, we design an algorithm to generate random samples from $k$-DPPs defined by a spherical Gaussian kernel on a unit sphere in $d$-dimensions, $\mathbb{S}^{d-1}$ in time polynomial in $k,d$.
DSJul 31, 2018
Composable Core-sets for Determinant Maximization Problems via Spectral SpannersPiotr Indyk, Sepideh Mahabadi, Shayan Oveis Gharan et al.
We study a spectral generalization of classical combinatorial graph spanners to the spectral setting. Given a set of vectors $V\subseteq \Re^d$, we say a set $U\subseteq V$ is an $α$-spectral spanner if for all $v\in V$ there is a probability distribution $μ_v$ supported on $U$ such that $$vv^\intercal \preceq α\cdot\mathbb{E}_{u\simμ_v} uu^\intercal.$$ We show that any set $V$ has an $\tilde{O}(d)$-spectral spanner of size $\tilde{O}(d)$ and this bound is almost optimal in the worst case. We use spectral spanners to study composable core-sets for spectral problems. We show that for many objective functions one can use a spectral spanner, independent of the underlying functions, as a core-set and obtain almost optimal composable core-sets. For example, for the determinant maximization problem we obtain an $\tilde{O}(k)^k$-composable core-set and we show that this is almost optimal in the worst case. Our algorithm is a spectral analogue of the classical greedy algorithm for finding (combinatorial) spanners in graphs. We expect that our spanners find many other applications in distributed or parallel models of computation. Our proof is spectral. As a side result of our techniques, we show that the rank of diagonally dominant lower-triangular matrices are robust under `small perturbations' which could be of independent interests.
LGAug 8, 2017
Time-Space Tradeoffs for Learning from Small Test Spaces: Learning Low Degree Polynomial FunctionsPaul Beame, Shayan Oveis Gharan, Xin Yang
We develop an extension of recently developed methods for obtaining time-space tradeoff lower bounds for problems of learning from random test samples to handle the situation where the space of tests is signficantly smaller than the space of inputs, a class of learning problems that is not handled by prior work. This extension is based on a measure of how matrices amplify the 2-norms of probability distributions that is more refined than the 2-norms of these matrices. As applications that follow from our new technique, we show that any algorithm that learns $m$-variate homogeneous polynomial functions of degree at most $d$ over $\mathbb{F}_2$ from evaluations on randomly chosen inputs either requires space $Ω(mn)$ or $2^{Ω(m)}$ time where $n=m^{Θ(d)}$ is the dimension of the space of such functions. These bounds are asymptotically optimal since they match the tradeoffs achieved by natural learning algorithms for the problems.
LGFeb 16, 2016
Monte Carlo Markov Chain Algorithms for Sampling Strongly Rayleigh Distributions and Determinantal Point ProcessesNima Anari, Shayan Oveis Gharan, Alireza Rezaei
Strongly Rayleigh distributions are natural generalizations of product and determinantal probability distributions and satisfy strongest form of negative dependence properties. We show that the "natural" Monte Carlo Markov Chain (MCMC) is rapidly mixing in the support of a {\em homogeneous} strongly Rayleigh distribution. As a byproduct, our proof implies Markov chains can be used to efficiently generate approximate samples of a $k$-determinantal point process. This answers an open question raised by Deshpande and Rademacher.
DSSep 12, 2013
Partitioning into ExpandersShayan Oveis Gharan, Luca Trevisan
Let G=(V,E) be an undirected graph, lambda_k be the k-th smallest eigenvalue of the normalized laplacian matrix of G. There is a basic fact in algebraic graph theory that lambda_k > 0 if and only if G has at most k-1 connected components. We prove a robust version of this fact. If lambda_k>0, then for some 1\leq \ell\leq k-1, V can be {\em partitioned} into l sets P_1,\ldots,P_l such that each P_i is a low-conductance set in G and induces a high conductance induced subgraph. In particular, φ(P_i)=O(l^3\sqrt{λ_l}) and φ(G[P_i]) >= λ_k/k^2). We make our results algorithmic by designing a simple polynomial time spectral algorithm to find such partitioning of G with a quadratic loss in the inside conductance of P_i's. Unlike the recent results on higher order Cheeger's inequality [LOT12,LRTV12], our algorithmic results do not use higher order eigenfunctions of G. If there is a sufficiently large gap between lambda_k and lambda_{k+1}, more precisely, if λ_{k+1} >= \poly(k) lambda_{k}^{1/4} then our algorithm finds a k partitioning of V into sets P_1,...,P_k such that the induced subgraph G[P_i] has a significantly larger conductance than the conductance of P_i in G. Such a partitioning may represent the best k clustering of G. Our algorithm is a simple local search that only uses the Spectral Partitioning algorithm as a subroutine. We expect to see further applications of this simple algorithm in clustering applications.
DSJan 23, 2013
Improved Cheeger's Inequality: Analysis of Spectral Partitioning Algorithms through Higher Order Spectral GapTsz Chiu Kwok, Lap Chi Lau, Yin Tat Lee et al.
Let φ(G) be the minimum conductance of an undirected graph G, and let 0=λ_1 <= λ_2 <=... <= λ_n <= 2 be the eigenvalues of the normalized Laplacian matrix of G. We prove that for any graph G and any k >= 2, φ(G) = O(k) λ_2 / \sqrt{λ_k}, and this performance guarantee is achieved by the spectral partitioning algorithm. This improves Cheeger's inequality, and the bound is optimal up to a constant factor for any k. Our result shows that the spectral partitioning algorithm is a constant factor approximation algorithm for finding a sparse cut if λ_k$ is a constant for some constant k. This provides some theoretical justification to its empirical performance in image segmentation and clustering problems. We extend the analysis to other graph partitioning problems, including multi-way partition, balanced separator, and maximum cut.