Thomas P. Hayes

CG
4papers
35citations
Novelty69%
AI Score28

4 Papers

CGJul 29, 2021
Reconstruction of Random Geometric Graphs: Breaking the Omega(r) distortion barrier

Varsha Dani, Josep Díaz, Thomas P. Hayes et al.

Embedding graphs in a geographical or latent space, i.e.\ inferring locations for vertices in Euclidean space or on a smooth manifold or submanifold, is a common task in network analysis, statistical inference, and graph visualization. We consider the classic model of random geometric graphs where $n$ points are scattered uniformly in a square of area $n$, and two points have an edge between them if and only if their Euclidean distance is less than $r$. The reconstruction problem then consists of inferring the vertex positions, up to the symmetries of the square, given only the adjacency matrix of the resulting graph. We give an algorithm that, if $r=n^α$ for any $α> 0$, with high probability reconstructs the vertex positions with a maximum error of $O(n^β)$ where $β=1/2-(4/3)α$, until $α\ge 3/8$ where $β=0$ and the error becomes $O(\sqrt{\log n})$. This improves over earlier results, which were unable to reconstruct with error less than $r$. Our method estimates Euclidean distances using a hybrid of graph distances and short-range estimates based on the number of common neighbors. We extend our results to the surface of the sphere in $\R^3$ and to hypercubes in any constant fixed dimension. Additionally we examine the extent to which reconstruction is still possible when the original adjacency lists have had a subset of the edges independently deleted at random.

DSApr 1, 2019
Distributed Metropolis Sampler with Optimal Parallelism

Weiming Feng, Thomas P. Hayes, Yitong Yin

The Metropolis-Hastings algorithm is a fundamental Markov chain Monte Carlo (MCMC) method for sampling and inference. With the advent of Big Data, distributed and parallel variants of MCMC methods are attracting increased attention. In this paper, we give a distributed algorithm that can correctly simulate sequential single-site Metropolis chains without any bias in a fully asynchronous message-passing model. Furthermore, if a natural Lipschitz condition is satisfied by the Metropolis filters, our algorithm can simulate $N$-step Metropolis chains within $O(N/n+\log n)$ rounds of asynchronous communications, where $n$ is the number of variables. For sequential single-site dynamics, whose mixing requires $Ω(n\log n)$ steps, this achieves an optimal linear speedup. For several well-studied important graphical models, including proper graph coloring, hardcore model, and Ising model, our condition for linear speedup is weaker than the respective uniqueness (mixing) conditions. The novel idea in our algorithm is to resolve updates in advance: the local Metropolis filters can often be executed correctly before the full information about neighboring spins is available. This achieves optimal parallelism without introducing any bias.

CRDec 18, 2016
Distributed Computing with Channel Noise

Abhinav Aggarwal, Varsha Dani, Thomas P. Hayes et al.

A group of $n$ users want to run a distributed protocol $π$ over a network where communication occurs via private point-to-point channels. Unfortunately, an adversary, who knows $π$, is able to maliciously flip bits on the channels. Can we efficiently simulate $π$ in the presence of such an adversary? We show that this is possible, even when $L$, the number of bits sent in $π$, and $T$, the number of bits flipped by the adversary are not known in advance. In particular, we show how to create a robust version of $π$ that 1) fails with probability at most $δ$, for any $δ>0$; and 2) sends $\tilde{O}(L + T)$ bits, where the $\tilde{O}$ notation hides a $\log (nL/ δ)$ term multiplying $L$. Additionally, we show how to improve this result when the average message size $α$ is not constant. In particular, we give an algorithm that sends $O( L (1 + (1/α) \log (n L/δ) + T)$ bits. This algorithm is adaptive in that it does not require a priori knowledge of $α$. We note that if $α$ is $Ω\left( \log (n L/δ) \right)$, then this improved algorithm sends only $O(L+T)$ bits, and is therefore within a constant factor of optimal.

LGJan 15, 2013
Block Coordinate Descent for Sparse NMF

Vamsi K. Potluru, Sergey M. Plis, Jonathan Le Roux et al.

Nonnegative matrix factorization (NMF) has become a ubiquitous tool for data analysis. An important variant is the sparse NMF problem which arises when we explicitly require the learnt features to be sparse. A natural measure of sparsity is the L$_0$ norm, however its optimization is NP-hard. Mixed norms, such as L$_1$/L$_2$ measure, have been shown to model sparsity robustly, based on intuitive attributes that such measures need to satisfy. This is in contrast to computationally cheaper alternatives such as the plain L$_1$ norm. However, present algorithms designed for optimizing the mixed norm L$_1$/L$_2$ are slow and other formulations for sparse NMF have been proposed such as those based on L$_1$ and L$_0$ norms. Our proposed algorithm allows us to solve the mixed norm sparsity constraints while not sacrificing computation time. We present experimental evidence on real-world datasets that shows our new algorithm performs an order of magnitude faster compared to the current state-of-the-art solvers optimizing the mixed norm and is suitable for large-scale datasets.