LG · Sep 21, 2022
Approximate sampling and estimation of partition functions using neural networks
George T. Cantwell
We consider the closely related problems of sampling from a distribution known up to a normalizing constant, and estimating said normalizing constant. We show how variational autoencoders (VAEs) can be applied to this task. In their standard applications, VAEs are trained to fit data drawn from an intractable distribution. We invert the logic and train the VAE to fit a simple and tractable distribution, on the assumption of a complex and intractable latent distribution, specified up to normalization. This procedure constructs approximations without the use of training data or Markov chain Monte Carlo sampling. We illustrate our method on three examples: the Ising model, graph clustering, and ranking.
SI · Dec 2, 2025
Embedding networks with the random walk first return time distribution
Vedanta Thapar, Renaud Lambiotte, George T. Cantwell
We propose the first return time distribution (FRTD) of a random walk as an interpretable and mathematically grounded node embedding. The FRTD assigns a probability mass function to each node, allowing us to define a distance between any pair of nodes using standard metrics for discrete distributions. We present several arguments to motivate the FRTD embedding. First, we show that FRTDs are strictly more informative than eigenvalue spectra, yet insufficient for complete graph identification, thus placing FRTD equivalence between cospectrality and isomorphism. Second, we argue that FRTD equivalence between nodes captures structural similarity. Third, we empirically demonstrate that the FRTD embedding outperforms manually designed graph metrics in network alignment tasks. Finally, we show that random networks that approximately match the FRTD of a desired target also preserve other salient features. Together these results demonstrate the FRTD as a simple and mathematically principled embedding for complex networks.
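As a rough illustration of the quantity described in this abstract, the sketch below computes a truncated first return time distribution for one node of a small unweighted graph and compares two nodes with the total variation distance. The function names, the truncation length `t_max`, and the choice of total variation as the metric are assumptions made for this example, not details taken from the paper.

```python
import numpy as np

def first_return_time_distribution(adjacency, node, t_max):
    """Probability that a walk started at `node` first returns at step t = 1..t_max.

    Assumes a simple random walk on a graph with no isolated nodes.
    The result is truncated at t_max, so it may sum to less than 1.
    """
    A = np.asarray(adjacency, dtype=float)
    P = A / A.sum(axis=1, keepdims=True)      # row-stochastic transition matrix
    others = [k for k in range(len(A)) if k != node]
    Q = P[np.ix_(others, others)]             # steps that avoid `node`
    back = P[others, node]                    # steps that land on `node`
    frtd = np.zeros(t_max)
    frtd[0] = P[node, node]                   # one-step return (self-loop only)
    state = P[node, others].copy()            # walker position after leaving
    for t in range(2, t_max + 1):
        frtd[t - 1] = state @ back            # return exactly at step t
        state = state @ Q                     # keep walking without returning
    return frtd

def total_variation(p, q):
    """Total variation distance between two (truncated) distributions."""
    return 0.5 * np.abs(np.asarray(p) - np.asarray(q)).sum()

# Path graph 0 - 1 - 2: the end node and the middle node have different FRTDs.
path = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
end_node = first_return_time_distribution(path, 0, t_max=4)
middle_node = first_return_time_distribution(path, 1, t_max=4)
distance = total_variation(end_node, middle_node)
```

In this toy case the two end nodes of the path have identical distributions, which matches the abstract's point that FRTD equivalence reflects structural similarity.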
SI · Dec 17, 2025
Model inference for ranking from pairwise comparisons
Daniel Sánchez Catalina, George T. Cantwell
We consider the problem of ranking objects from noisy pairwise comparisons, for example, ranking tennis players from the outcomes of matches. We follow a standard approach to this problem and assume that each object has an unobserved strength and that the outcome of each comparison depends probabilistically on the strengths of the comparands. However, we do not assume a priori knowledge of how strengths affect outcomes. Instead, we present an efficient algorithm for simultaneously inferring both the unobserved strengths and the function that maps strengths to probabilities. Despite this problem being under-constrained, we present experimental evidence that the conclusions of our Bayesian approach are robust to different model specifications. We include several case studies to exemplify the method on real-world data sets.
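For contrast with the approach in this abstract, the sketch below shows the kind of fixed-link model that the standard approach typically assumes: each object has a real-valued strength, and the probability that one object beats another is a logistic function of the strength difference. The logistic link and the names `win_probability` and `log_likelihood` are assumptions chosen for illustration; the paper infers the link function rather than fixing it.

```python
import math

def win_probability(s_i, s_j):
    """Probability that an object of strength s_i beats one of strength s_j.

    Uses a fixed logistic link, which is an assumption of this sketch.
    """
    return 1.0 / (1.0 + math.exp(-(s_i - s_j)))

def log_likelihood(strengths, matches):
    """Log-likelihood of observed (winner, loser) pairs given the strengths."""
    return sum(
        math.log(win_probability(strengths[winner], strengths[loser]))
        for winner, loser in matches
    )

# Hypothetical data: player "a" beats "b" twice, and "b" beats "c" once.
matches = [("a", "b"), ("a", "b"), ("b", "c")]
consistent = log_likelihood({"a": 1.0, "b": 0.0, "c": -1.0}, matches)
reversed_order = log_likelihood({"a": -1.0, "b": 0.0, "c": 1.0}, matches)
```

Strengths that agree with the observed outcomes give a higher log-likelihood than the reversed assignment, which is the signal a fitting procedure would exploit.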
AI · Oct 1, 2021
Belief propagation for permutations, rankings, and partial orders
George T. Cantwell, Cristopher Moore
Many datasets give partial information about an ordering or ranking by indicating which team won a game, which item a user prefers, or who infected whom. We define a continuous spin system whose Gibbs distribution is the posterior distribution on permutations, given a probabilistic model of these interactions. Using the cavity method we derive a belief propagation algorithm that computes the marginal distribution of each node's position. In addition, the Bethe free energy lets us approximate the number of linear extensions of a partial order and perform model selection between competing probabilistic models, such as the Bradley-Terry-Luce model of noisy comparisons and its cousins.
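To make "the number of linear extensions of a partial order" concrete, the sketch below counts them exactly by brute-force enumeration of all permutations. This is only feasible for a handful of items and is not the Bethe free energy approximation described in the abstract; the function name and the encoding of relations as `(a, b)` pairs meaning "a must precede b" are assumptions of this example.

```python
from itertools import permutations

def count_linear_extensions(n, relations):
    """Count total orders of items 0..n-1 consistent with every (a, b) pair.

    Each pair (a, b) means item a must come before item b.
    Brute force over n! permutations, so only suitable for very small n.
    """
    count = 0
    for order in permutations(range(n)):
        position = {item: index for index, item in enumerate(order)}
        if all(position[a] < position[b] for a, b in relations):
            count += 1
    return count

# Two independent constraints on four items: 0 before 1, and 2 before 3.
num_extensions = count_linear_extensions(4, [(0, 1), (2, 3)])
```

Each constraint halves the 4! = 24 orderings independently, leaving 6 linear extensions.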
STAT-MECH · Sep 23, 2020
Belief propagation for networks with loops
Alec Kirkley, George T. Cantwell, M. E. J. Newman
Belief propagation is a widely used message passing method for the solution of probabilistic models on networks such as epidemic models, spin models, and Bayesian graphical models, but it suffers from the serious shortcoming that it works poorly in the common case of networks that contain short loops. Here we provide a solution to this long-standing problem, deriving a belief propagation method that allows for fast calculation of probability distributions in systems with short loops, potentially with high density, as well as giving expressions for the entropy and partition function, which are notoriously difficult quantities to compute. Using the Ising model as an example, we show that our approach gives excellent results on both real and synthetic networks, improving significantly on standard message passing methods. We also discuss potential applications of our method to a variety of other problems.
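As background for this abstract, the sketch below implements standard belief propagation for an Ising model and checks it against exact enumeration on a small tree, where the standard method is exact. It is the baseline that the abstract says works poorly on networks with short loops, not the loop-corrected method of the paper. The parameter values `beta`, `J`, and `h`, and all function names, are illustrative assumptions.

```python
import itertools
import numpy as np

SPINS = np.array([-1.0, 1.0])

def bp_ising_marginals(edges, n, beta=0.5, J=1.0, h=0.2, iters=50):
    """Single-spin marginals from standard (tree-exact) belief propagation."""
    neighbors = {i: [] for i in range(n)}
    for a, b in edges:
        neighbors[a].append(b)
        neighbors[b].append(a)
    # messages[(i, j)] is a distribution over the spin of node j.
    messages = {(i, j): np.full(2, 0.5) for i in neighbors for j in neighbors[i]}
    pair = np.exp(beta * J * np.outer(SPINS, SPINS))   # pair[s_i, s_j]
    local = np.exp(beta * h * SPINS)                   # external-field factor
    for _ in range(iters):
        updated = {}
        for (i, j) in messages:
            incoming = local.copy()
            for k in neighbors[i]:
                if k != j:
                    incoming *= messages[(k, i)]
            msg = incoming @ pair                      # sum over s_i
            updated[(i, j)] = msg / msg.sum()
        messages = updated
    marginals = np.empty((n, 2))
    for i in range(n):
        belief = local.copy()
        for k in neighbors[i]:
            belief *= messages[(k, i)]
        marginals[i] = belief / belief.sum()
    return marginals

def exact_ising_marginals(edges, n, beta=0.5, J=1.0, h=0.2):
    """Single-spin marginals by summing over all 2^n configurations."""
    marginals = np.zeros((n, 2))
    for config in itertools.product([-1, 1], repeat=n):
        energy = -J * sum(config[a] * config[b] for a, b in edges) - h * sum(config)
        weight = np.exp(-beta * energy)
        for i, s in enumerate(config):
            marginals[i, (s + 1) // 2] += weight
    return marginals / marginals.sum(axis=1, keepdims=True)

# A star-shaped tree with node 1 at the centre.
tree_edges = [(0, 1), (1, 2), (1, 3)]
bp_result = bp_ising_marginals(tree_edges, n=4)
exact_result = exact_ising_marginals(tree_edges, n=4)
```

Adding an edge that closes a short loop, for example `(0, 2)`, is a simple way to see the two results begin to disagree.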
SI · Jul 29, 2019
Improved mutual information measure for classification and community detection
M. E. J. Newman, George T. Cantwell, Jean-Gabriel Young
The information theoretic quantity known as mutual information finds wide use in classification and community detection analyses to compare two classifications of the same set of objects into groups. In the context of classification algorithms, for instance, it is often used to compare discovered classes to known ground truth and hence to quantify algorithm performance. Here we argue that the standard mutual information, as commonly defined, omits a crucial term which can become large under real-world conditions, producing results that can be substantially in error. We demonstrate how to correct this error and define a mutual information that works in all cases. We discuss practical implementation of the new measure and give some example applications.
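For reference, the sketch below computes the standard mutual information between two labelings of the same objects from their contingency counts, measured in nats. It is the conventional definition that the abstract argues is incomplete, and it does not include the correction term proposed in the paper. The function name is an assumption of this example.

```python
import math
from collections import Counter

def mutual_information(labels_a, labels_b):
    """Standard mutual information (in nats) between two labelings."""
    if len(labels_a) != len(labels_b):
        raise ValueError("labelings must cover the same objects")
    n = len(labels_a)
    joint = Counter(zip(labels_a, labels_b))   # contingency table counts
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    total = 0.0
    for (a, b), n_ab in joint.items():
        total += (n_ab / n) * math.log(n_ab * n / (count_a[a] * count_b[b]))
    return total

# Identical labelings share all their information; crossed labelings share none.
identical = mutual_information([0, 0, 1, 1], [0, 0, 1, 1])
independent = mutual_information([0, 0, 1, 1], [0, 1, 0, 1])
```

For two equal-sized groups, identical labelings give log 2 nats, which is the entropy of either labeling.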