Estimating Mixed Memberships with Sharp Eigenvector Deviations
This work addresses the need for efficient and accurate overlapping community detection in networks. Its contribution is incremental: it improves upon existing methods with sharper bounds and faster computation.
The paper tackles the problem of estimating overlapping community memberships in networks by providing what the authors believe to be the first per-node convergence rates for the Mixed Membership Stochastic Blockmodel (MMSB). In experiments, the method achieves lower error with lower variability than competing methods and processes real-world networks of up to 100,000 nodes within tens of seconds.
We consider the problem of estimating community memberships of nodes in a network, where every node is associated with a vector determining its degree of membership in each community. Existing provably consistent algorithms often require strong assumptions about the population, are computationally expensive, and only provide an overall error bound for the whole community membership matrix. This paper provides uniform rates of convergence for the inferred community membership vector of each node in a network generated from the Mixed Membership Stochastic Blockmodel (MMSB); to our knowledge, this is the first work to establish per-node rates for overlapping community detection in networks. We achieve this by establishing sharp row-wise eigenvector deviation bounds for MMSB. Based on the simplex structure inherent in the eigen-decomposition of the population matrix, we build on established corner-finding algorithms from the optimization community to infer the community membership vectors. Our results hold over a broad parameter regime where the average degree only grows poly-logarithmically with the number of nodes. Using experiments with simulated and real datasets, we show that our method achieves lower error with lower variability than competing methods, and processes real-world networks of up to 100,000 nodes within tens of seconds.
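The simplex structure mentioned above can be illustrated with a small sketch. If the population matrix is P = Theta B Theta^T with a full-rank B and at least one pure node per community, then the top-K eigenvector matrix V satisfies V = Theta V_pure, so every row of V is a convex combination of the K rows belonging to pure nodes. Finding those corner rows and inverting them gives the memberships back. The code below is an assumption-laden illustration, not the paper's algorithm: it uses successive projection as one well-known corner-finding routine, and the function names (`make_population`, `sample_adjacency`, `find_corners`, `estimate_memberships`), the link matrix B, and all parameter values are hypothetical choices made for the example.

```python
import itertools
import numpy as np


def make_population(n, K, rho, alpha, rng):
    """Build memberships Theta and the edge-probability matrix P = Theta B Theta^T."""
    theta = rng.dirichlet(alpha * np.ones(K), size=n)    # each row lies on the simplex
    theta[:K] = np.eye(K)                                # assumption: one pure node per community
    B = rho * (0.1 * np.ones((K, K)) + 0.9 * np.eye(K))  # hypothetical full-rank link matrix
    return theta, theta @ B @ theta.T


def sample_adjacency(P, rng):
    """Draw a symmetric 0/1 adjacency matrix with independent edges and no self-loops."""
    upper = np.triu(rng.random(P.shape) < P, k=1)
    return (upper | upper.T).astype(float)


def find_corners(V, K):
    """Successive projection: greedily pick K rows of V as approximate simplex corners."""
    R = V.copy()
    corners = []
    for _ in range(K):
        i = int(np.argmax(np.einsum("ij,ij->i", R, R)))  # row with the largest residual norm
        corners.append(i)
        u = R[i] / np.linalg.norm(R[i])
        R = R - np.outer(R @ u, u)                       # project out the chosen direction
    return corners


def estimate_memberships(M, K):
    """Estimate memberships from the top-K eigenvectors of a symmetric matrix M."""
    vals, vecs = np.linalg.eigh(M)
    top = np.argsort(np.abs(vals))[-K:]                  # K eigenvalues largest in magnitude
    V = vecs[:, top]
    corners = find_corners(V, K)
    theta_hat = V @ np.linalg.inv(V[corners])            # express each row in corner coordinates
    theta_hat = np.clip(theta_hat, 0.0, None)            # noise can push entries below zero
    theta_hat /= np.maximum(theta_hat.sum(axis=1, keepdims=True), 1e-12)
    return theta_hat, corners


rng = np.random.default_rng(0)
theta, P = make_population(n=500, K=3, rho=0.5, alpha=0.5, rng=rng)
A = sample_adjacency(P, rng)
theta_hat, corners = estimate_memberships(A, K=3)

# Worst per-node (row-wise) l1 error, minimized over relabelings of the communities.
err = min(np.abs(theta_hat[:, list(p)] - theta).sum(axis=1).max()
          for p in itertools.permutations(range(3)))
print(f"worst per-node l1 error: {err:.3f}")
```

Applied to the population matrix P itself, this procedure recovers Theta exactly up to a relabeling of communities. Applied to a sampled adjacency matrix A, the quality of each node's estimate depends on how far the corresponding row of the empirical eigenvectors deviates from its population counterpart, which is why row-wise deviation bounds translate into per-node error rates. The dense `eigh` call is chosen for brevity only; a sparse top-K eigensolver would be the natural substitute for large networks.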