Bayesian estimation of the latent dimension and communities in stochastic blockmodels
This work addresses a practical limitation in network analysis that affects both researchers and practitioners, though it is an incremental improvement over existing spectral methods.
The authors tackle the need to pre-specify the number of communities and the latent dimension when using spectral embedding for community detection in networks. They propose a Bayesian model that selects both parameters automatically, and they report promising performance on simulated and real-world data.
Spectral embedding of adjacency or Laplacian matrices of undirected graphs is a common technique for representing a network in a lower-dimensional latent space, with optimal theoretical guarantees. The embedding can be used to estimate the community structure of the network, with strong consistency results in the stochastic blockmodel framework. One of the main practical limitations of standard algorithms for community detection from spectral embeddings is that the number of communities and the latent dimension of the embedding must be specified in advance. In this article, a novel Bayesian model for simultaneous and automatic selection of the appropriate dimension of the latent space and the number of blocks is proposed. Extensions to directed and bipartite graphs are discussed. The model is tested on simulated and real-world network data, showing promising performance for recovering latent community structure.
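
To make the limitation described in the abstract concrete, here is a minimal sketch of the standard pipeline: simulate a two-block stochastic blockmodel, embed the adjacency matrix spectrally, and cluster the embedding. It is not the authors' Bayesian model. The function names, the block probability matrix `B`, the network size, and the plain Lloyd's k-means step are all illustrative assumptions. The point is only that both the latent dimension `d` and the number of communities `K` have to be supplied by hand.

```python
import numpy as np


def simulate_sbm(n, B, z, rng):
    """Sample a symmetric adjacency matrix with no self-loops from a blockmodel."""
    P = B[np.ix_(z, z)]                      # P[i, j] = B[z[i], z[j]]
    U = rng.random((n, n))
    A = np.triu(U < P, k=1).astype(float)    # keep the strict upper triangle
    return A + A.T                           # symmetrise (undirected graph)


def adjacency_spectral_embedding(A, d):
    """Embed nodes with the d eigenpairs of A that are largest in magnitude."""
    vals, vecs = np.linalg.eigh(A)
    top = np.argsort(np.abs(vals))[::-1][:d]
    return vecs[:, top] * np.sqrt(np.abs(vals[top]))


def kmeans(X, K, n_iter=50):
    """Plain Lloyd's algorithm with a deterministic farthest-point start."""
    centres = [X[0]]
    for _ in range(1, K):
        dist = np.min([np.sum((X - c) ** 2, axis=1) for c in centres], axis=0)
        centres.append(X[np.argmax(dist)])
    centres = np.array(centres)
    for _ in range(n_iter):
        sq = ((X[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
        labels = np.argmin(sq, axis=1)
        for k in range(K):
            if np.any(labels == k):
                centres[k] = X[labels == k].mean(axis=0)
    return labels


rng = np.random.default_rng(0)
n = 200
K = 2   # number of communities: chosen by hand (assumed known here)
d = 2   # latent dimension: chosen by hand (assumed known here)

z = np.repeat(np.arange(K), n // K)          # true community labels
B = np.array([[0.5, 0.1],
              [0.1, 0.5]])                   # illustrative block probabilities

A = simulate_sbm(n, B, z, rng)
X = adjacency_spectral_embedding(A, d)
labels = kmeans(X, K)

# Cluster labels are only defined up to permutation; with K = 2 there are
# two possible matchings, so take the better of the two.
agreement = max(np.mean(labels == z), np.mean(labels != z))
```

In this sketch `d` and `K` happen to match the simulated truth. With real network data neither is known in advance, which is the gap the proposed model aims to close.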