Provable Estimation of the Number of Blocks in Block Models
This addresses a key limitation in community detection algorithms that require known cluster numbers, benefiting researchers and practitioners in network analysis.
The paper tackles the problem of estimating the number of clusters in community detection for networks without prior knowledge, achieving exact recovery of the number of clusters and clustering matrix with high probability under broad conditions.
Community detection is a fundamental unsupervised learning problem for unlabeled networks which has a broad range of applications. Many community detection algorithms assume that the number of clusters $r$ is known apriori. In this paper, we propose an approach based on semi-definite relaxations, which does not require prior knowledge of model parameters like many existing convex relaxation methods and recovers the number of clusters and the clustering matrix exactly under a broad parameter regime, with probability tending to one. On a variety of simulated and real data experiments, we show that the proposed method often outperforms state-of-the-art techniques for estimating the number of clusters.