Cheng Mao

ST
14papers
411citations
Novelty60%
AI Score29

14 Papers

STFeb 20, 2023
Sharp analysis of EM for learning mixtures of pairwise differences

Abhishek Dhawan, Cheng Mao, Ashwin Pananjady

We consider a symmetric mixture of linear regressions with random samples from the pairwise comparison design, which can be seen as a noisy version of a type of Euclidean distance geometry problem. We analyze the expectation-maximization (EM) algorithm locally around the ground truth and establish that the sequence converges linearly, providing an $\ell_\infty$-norm guarantee on the estimation error of the iterates. Furthermore, we show that the limit of the EM sequence achieves the sharp rate of estimation in the $\ell_2$-norm, matching the information-theoretically optimal constant. We also argue through simulation that convergence from a random initialization is much more delicate in this setting, and does not appear to occur in general. Our results show that the EM algorithm can exhibit several unique behaviors when the covariate distribution is suitably structured.

STOct 22, 2021
Testing network correlation efficiently via counting trees

Cheng Mao, Yihong Wu, Jiaming Xu et al.

We propose a new procedure for testing whether two networks are edge-correlated through some latent vertex correspondence. The test statistic is based on counting the co-occurrences of signed trees for a family of non-isomorphic trees. When the two networks are Erdős-Rényi random graphs $\mathcal{G}(n,q)$ that are either independent or correlated with correlation coefficient $ρ$, our test runs in $n^{2+o(1)}$ time and succeeds with high probability as $n\to\infty$, provided that $n\min\{q,1-q\} \ge n^{-o(1)}$ and $ρ^2>α\approx 0.338$, where $α$ is Otter's constant so that the number of unlabeled trees with $K$ edges grows as $(1/α)^K$. This significantly improves the prior work in terms of statistical accuracy, running time, and graph sparsity.

STOct 11, 2021
Exact Matching of Random Graphs with Constant Correlation

Cheng Mao, Mark Rudelson, Konstantin Tikhomirov

This paper deals with the problem of graph matching or network alignment for Erdős--Rényi graphs, which can be viewed as a noisy average-case version of the graph isomorphism problem. Let $G$ and $G'$ be $G(n, p)$ Erdős--Rényi graphs marginally, identified with their adjacency matrices. Assume that $G$ and $G'$ are correlated such that $\mathbb{E}[G_{ij} G'_{ij}] = p(1-α)$. For a permutation $π$ representing a latent matching between the vertices of $G$ and $G'$, denote by $G^π$ the graph obtained from permuting the vertices of $G$ by $π$. Observing $G^π$ and $G'$, we aim to recover the matching $π$. In this work, we show that for every $\varepsilon \in (0,1]$, there is $n_0>0$ depending on $\varepsilon$ and absolute constants $α_0, R > 0$ with the following property. Let $n \ge n_0$, $(1+\varepsilon) \log n \le np \le n^{\frac{1}{R \log \log n}}$, and $0 < α< \min(α_0,\varepsilon/4)$. There is a polynomial-time algorithm $F$ such that $\mathbb{P}\{F(G^π,G')=π\}=1-o(1)$. This is the first polynomial-time algorithm that recovers the exact matching between vertices of correlated Erdős--Rényi graphs with constant correlation with high probability. The algorithm is based on comparison of partition trees associated with the graph vertices.

STMay 31, 2021
Optimal Spectral Recovery of a Planted Vector in a Subspace

Cheng Mao, Alexander S. Wein

Recovering a planted vector $v$ in an $n$-dimensional random subspace of $\mathbb{R}^N$ is a generic task related to many problems in machine learning and statistics, such as dictionary learning, subspace recovery, principal component analysis, and non-Gaussian component analysis. In this work, we study computationally efficient estimation and detection of a planted vector $v$ whose $\ell_4$ norm differs from that of a Gaussian vector with the same $\ell_2$ norm. For instance, in the special case where $v$ is an $N ρ$-sparse vector with Bernoulli-Gaussian or Bernoulli-Rademacher entries, our results include the following: (1) We give an improved analysis of a slight variant of the spectral method proposed by Hopkins, Schramm, Shi, and Steurer (2016), showing that it approximately recovers $v$ with high probability in the regime $n ρ\ll \sqrt{N}$. This condition subsumes the conditions $ρ\ll 1/\sqrt{n}$ or $n \sqrtρ \lesssim \sqrt{N}$ required by previous work up to polylogarithmic factors. We achieve $\ell_\infty$ error bounds for the spectral estimator via a leave-one-out analysis, from which it follows that a simple thresholding procedure exactly recovers $v$ with Bernoulli-Rademacher entries, even in the dense case $ρ= 1$. (2) We study the associated detection problem and show that in the regime $n ρ\gg \sqrt{N}$, any spectral method from a large class (and more generally, any low-degree polynomial of the input) fails to detect the planted vector. This matches the condition for recovery and offers evidence that no polynomial-time algorithm can succeed in recovering a Bernoulli-Gaussian vector $v$ when $n ρ\gg \sqrt{N}$.

DSJan 28, 2021
Random Graph Matching with Improved Noise Robustness

Cheng Mao, Mark Rudelson, Konstantin Tikhomirov

Graph matching, also known as network alignment, refers to finding a bijection between the vertex sets of two given graphs so as to maximally align their edges. This fundamental computational problem arises frequently in multiple fields such as computer vision and biology. Recently, there has been a plethora of work studying efficient algorithms for graph matching under probabilistic models. In this work, we propose a new algorithm for graph matching: Our algorithm associates each vertex with a signature vector using a multistage procedure and then matches a pair of vertices from the two graphs if their signature vectors are close to each other. We show that, for two Erdős--Rényi graphs with edge correlation $1-α$, our algorithm recovers the underlying matching exactly with high probability when $α\le 1 / (\log \log n)^C$, where $n$ is the number of vertices in each graph and $C$ denotes a positive universal constant. This improves the condition $α\le 1 / (\log n)^C$ achieved in previous work.

STSep 14, 2020
Learning Mixtures of Permutations: Groups of Pairwise Comparisons and Combinatorial Method of Moments

Cheng Mao, Yihong Wu

In applications such as rank aggregation, mixture models for permutations are frequently used when the population exhibits heterogeneity. In this work, we study the widely used Mallows mixture model. In the high-dimensional setting, we propose a polynomial-time algorithm that learns a Mallows mixture of permutations on $n$ elements with the optimal sample complexity that is proportional to $\log n$, improving upon previous results that scale polynomially with $n$. In the high-noise regime, we characterize the optimal dependency of the sample complexity on the noise parameter. Both objectives are accomplished by first studying demixing permutations under a noiseless query model using groups of pairwise comparisons, which can be viewed as moments of the mixing distribution, and then extending these results to the noisy Mallows model by simulating the noiseless oracle.

PRJul 20, 2019
Spectral Graph Matching and Regularized Quadratic Relaxations II: Erdős-Rényi Graphs and Universality

Zhou Fan, Cheng Mao, Yihong Wu et al.

We analyze a new spectral graph matching algorithm, GRAph Matching by Pairwise eigen-Alignments (GRAMPA), for recovering the latent vertex correspondence between two unlabeled, edge-correlated weighted graphs. Extending the exact recovery guarantees established in the companion paper for Gaussian weights, in this work, we prove the universality of these guarantees for a general correlated Wigner model. In particular, for two Erdős-Rényi graphs with edge correlation coefficient $1-σ^2$ and average degree at least $\operatorname{polylog}(n)$, we show that GRAMPA exactly recovers the latent vertex correspondence with high probability when $σ\lesssim 1/\operatorname{polylog}(n)$. Moreover, we establish a similar guarantee for a variant of GRAMPA, corresponding to a tighter quadratic programming relaxation of the quadratic assignment problem. Our analysis exploits a resolvent representation of the GRAMPA similarity matrix and local laws for the resolvents of sparse Wigner matrices.

MLJul 20, 2019
Spectral Graph Matching and Regularized Quadratic Relaxations I: The Gaussian Model

Zhou Fan, Cheng Mao, Yihong Wu et al.

Graph matching aims at finding the vertex correspondence between two unlabeled graphs that maximizes the total edge weight correlation. This amounts to solving a computationally intractable quadratic assignment problem. In this paper we propose a new spectral method, GRAph Matching by Pairwise eigen-Alignments (GRAMPA). Departing from prior spectral approaches that only compare top eigenvectors, or eigenvectors of the same order, GRAMPA first constructs a similarity matrix as a weighted sum of outer products between all pairs of eigenvectors of the two graphs, with weights given by a Cauchy kernel applied to the separation of the corresponding eigenvalues, then outputs a matching by a simple rounding procedure. The similarity matrix can also be interpreted as the solution to a regularized quadratic programming relaxation of the quadratic assignment problem. For the Gaussian Wigner model in which two complete graphs on $n$ vertices have Gaussian edge weights with correlation coefficient $1-σ^2$, we show that GRAMPA exactly recovers the correct vertex correspondence with high probability when $σ= O(\frac{1}{\log n})$. This matches the state of the art of polynomial-time algorithms, and significantly improves over existing spectral methods which require $σ$ to be polynomially small in $n$. The superiority of GRAMPA is also demonstrated on a variety of synthetic and real datasets, in terms of both statistical accuracy and computational efficiency. Universality results, including similar guarantees for dense and sparse Erdős-Rényi graphs, are deferred to the companion paper.

STApr 5, 2019
Estimation of Monge Matrices

Jan-Christian Hütter, Cheng Mao, Philippe Rigollet et al.

Monge matrices and their permuted versions known as pre-Monge matrices naturally appear in many domains across science and engineering. While the rich structural properties of such matrices have long been leveraged for algorithmic purposes, little is known about their impact on statistical estimation. In this work, we propose to view this structure as a shape constraint and study the problem of estimating a Monge matrix subject to additive random noise. More specifically, we establish the minimax rates of estimation of Monge and pre-Monge matrices. In the case of pre-Monge matrices, the minimax-optimal least-squares estimator is not efficiently computable, and we propose two efficient estimators and establish their rates of convergence. Our theoretical findings are supported by numerical experiments.

MLJun 25, 2018
Towards Optimal Estimation of Bivariate Isotonic Matrices with Unknown Permutations

Cheng Mao, Ashwin Pananjady, Martin J. Wainwright

Many applications, including rank aggregation, crowd-labeling, and graphon estimation, can be modeled in terms of a bivariate isotonic matrix with unknown permutations acting on its rows and/or columns. We consider the problem of estimating an unknown matrix in this class, based on noisy observations of (possibly, a subset of) its entries. We design and analyze polynomial-time algorithms that improve upon the state of the art in two distinct metrics, showing, in particular, that minimax optimal, computationally efficient estimation is achievable in certain settings. Along the way, we prove matching upper and lower bounds on the minimax radii of certain cone testing problems, which may be of independent interest.

MLFeb 27, 2018
Breaking the $1/\sqrt{n}$ Barrier: Faster Rates for Permutation-based Models in Polynomial Time

Cheng Mao, Ashwin Pananjady, Martin J. Wainwright

Many applications, including rank aggregation and crowd-labeling, can be modeled in terms of a bivariate isotonic matrix with unknown permutations acting on its rows and columns. We consider the problem of estimating such a matrix based on noisy observations of a subset of its entries, and design and analyze a polynomial-time algorithm that improves upon the state of the art. In particular, our results imply that any such $n \times n$ matrix can be estimated efficiently in the normalized Frobenius norm at rate $\widetilde{\mathcal O}(n^{-3/4})$, thus narrowing the gap between $\widetilde{\mathcal O}(n^{-1})$ and $\widetilde{\mathcal O}(n^{-1/2})$, which were hitherto the rates of the most statistically and computationally efficient methods, respectively.

MLOct 28, 2017
Minimax Rates and Efficient Algorithms for Noisy Sorting

Cheng Mao, Jonathan Weed, Philippe Rigollet

There has been a recent surge of interest in studying permutation-based models for ranking from pairwise comparison data. Despite being structurally richer and more robust than parametric ranking models, permutation-based models are less well understood statistically and generally lack efficient learning algorithms. In this work, we study a prototype of permutation-based ranking models, namely, the noisy sorting model. We establish the optimal rates of learning the model under two sampling procedures. Furthermore, we provide a fast algorithm to achieve near-optimal rates if the observations are sampled independently. Along the way, we discover properties of the symmetric group which are of theoretical interest.

LGJul 19, 2017
Worst-case vs Average-case Design for Estimation from Fixed Pairwise Comparisons

Ashwin Pananjady, Cheng Mao, Vidya Muthukumar et al.

Pairwise comparison data arises in many domains, including tournament rankings, web search, and preference elicitation. Given noisy comparisons of a fixed subset of pairs of items, we study the problem of estimating the underlying comparison probabilities under the assumption of strong stochastic transitivity (SST). We also consider the noisy sorting subclass of the SST model. We show that when the assignment of items to the topology is arbitrary, these permutation-based models, unlike their parametric counterparts, do not admit consistent estimation for most comparison topologies used in practice. We then demonstrate that consistent estimation is possible when the assignment of items to the topology is randomized, thus establishing a dichotomy between worst-case and average-case designs. We propose two estimators in the average-case setting and analyze their risk, showing that it depends on the comparison topology only through the degree sequence of the topology. The rates achieved by these estimators are shown to be optimal for a large class of graphs. Our results are corroborated by simulations on multiple comparison topologies.

STJul 8, 2016
Optimal Rates of Statistical Seriation

Nicolas Flammarion, Cheng Mao, Philippe Rigollet

Given a matrix the seriation problem consists in permuting its rows in such way that all its columns have the same shape, for example, they are monotone increasing. We propose a statistical approach to this problem where the matrix of interest is observed with noise and study the corresponding minimax rate of estimation of the matrices. Specifically, when the columns are either unimodal or monotone, we show that the least squares estimator is optimal up to logarithmic factors and adapts to matrices with a certain natural structure. Finally, we propose a computationally efficient estimator in the monotonic case and study its performance both theoretically and experimentally. Our work is at the intersection of shape constrained estimation and recent work that involves permutation learning, such as graph denoising and ranking.