DSPRMLJul 8, 2015

Multisection in the Stochastic Block Model using Semidefinite Programming

arXiv:1507.02323v160 citations
Originality Incremental advance
AI Analysis

This work addresses community detection in networks, an incremental advance in graph analysis with implications for social and biological network modeling.

The paper tackles the problem of exact recovery of hidden community structures in graphs using the Stochastic Block Model, showing that a semidefinite programming algorithm achieves optimal recovery thresholds for certain cluster sizes but not for larger ones.

We consider the problem of identifying underlying community-like structures in graphs. Towards this end we study the Stochastic Block Model (SBM) on $k$-clusters: a random model on $n=km$ vertices, partitioned in $k$ equal sized clusters, with edges sampled independently across clusters with probability $q$ and within clusters with probability $p$, $p>q$. The goal is to recover the initial "hidden" partition of $[n]$. We study semidefinite programming (SDP) based algorithms in this context. In the regime $p = \frac{α\log(m)}{m}$ and $q = \frac{β\log(m)}{m}$ we show that a certain natural SDP based algorithm solves the problem of {\em exact recovery} in the $k$-community SBM, with high probability, whenever $\sqrtα - \sqrtβ > \sqrt{1}$, as long as $k=o(\log n)$. This threshold is known to be the information theoretically optimal. We also study the case when $k=θ(\log(n))$. In this case however we achieve recovery guarantees that no longer match the optimal condition $\sqrtα - \sqrtβ > \sqrt{1}$, thus leaving achieving optimality for this range an open question.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes