Revisiting Instance-Optimal Cluster Recovery in the Labeled Stochastic Block Model
This work addresses community detection in large-scale networks, providing a scalable and parameter-free algorithm, though it is incremental as it builds on existing spectral clustering methods.
The paper tackles the problem of recovering hidden communities in the Labeled Stochastic Block Model with linearly growing cluster sizes, deriving necessary and sufficient conditions for misclassification error and proposing the IAC algorithm that matches instance-specific lower bounds with computational complexity of O(n polylog(n)).
In this paper, we investigate the problem of recovering hidden communities in the Labeled Stochastic Block Model (LSBM) with a finite number of clusters whose sizes grow linearly with the total number of nodes. We derive the necessary and sufficient conditions under which the expected number of misclassified nodes is less than $ s $, for any number $ s = o(n) $. To achieve this, we propose IAC (Instance-Adaptive Clustering), the first algorithm whose performance matches the instance-specific lower bounds both in expectation and with high probability. IAC is a novel two-phase algorithm that consists of a one-shot spectral clustering step followed by iterative likelihood-based cluster assignment improvements. This approach is based on the instance-specific lower bound and notably does not require any knowledge of the model parameters, including the number of clusters. By performing the spectral clustering only once, IAC maintains an overall computational complexity of $ \mathcal{O}(n\, \text{polylog}(n)) $, making it scalable and practical for large-scale problems.