IT LG SI ST

Nov 29, 2018

Testing Changes in Communities for the Stochastic Block Model

Aditya Gangrade, Praveen Venkatesh, Bobak Nazer, Venkatesh Saligrama

arXiv:1812.00769v33.33 citations

Originality: Incremental advance

AI Analysis

This paper addresses the detection of community changes in sparse networks. The advance is incremental, but the proposed tests succeed at signal-to-noise ratios below those needed for community recovery.

The paper tackles the problem of detecting changes in community memberships within stochastic block models. It proposes computationally efficient tests that work even when community recovery is impossible. For large changes (s ≫ √n), detection needs only SNR = O(1), compared with SNR = Θ(log n) for a naive test based on community recovery. For small changes (s ≪ √n), an information-theoretic lower bound shows that no algorithm outperforms the naive approach of first estimating the communities and then detecting changes.

We propose and analyze the problems of \textit{community goodness-of-fit and two-sample testing} for stochastic block models (SBM), where changes arise due to modification in community memberships of nodes. Motivated by practical applications, we consider the challenging sparse regime, where expected node degrees are constant, and the inter-community mean degree ($b$) scales proportionally to intra-community mean degree ($a$). Prior work has sharply characterized partial or full community recovery in terms of a "signal-to-noise ratio" ($\mathrm{SNR}$) based on $a$ and $b$. For both problems, we propose computationally-efficient tests that can succeed far beyond the regime where recovery of community membership is even possible. Overall, for large changes, $s \gg \sqrt{n}$, we need only $\mathrm{SNR} = O(1)$ whereas a naïve test based on community recovery with $O(s)$ errors requires $\mathrm{SNR} = \Theta(\log n)$. Conversely, in the small change regime, $s \ll \sqrt{n}$, via an information-theoretic lower bound, we show that, surprisingly, no algorithm can do better than the naïve algorithm that first estimates the community up to $O(s)$ errors and then detects changes. We validate these phenomena numerically on SBMs and on real-world datasets as well as Markov Random Fields where we only observe node data rather than the existence of links.
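To make the setup in the abstract concrete, the following is a minimal sketch of the two-sample setting: a balanced two-community SBM in the sparse regime, and a second graph drawn after $s$ nodes change community. It is not the authors' test. The function names (`snr`, `sample_sbm`, `flip_labels`) and the parameter values are hypothetical. The abstract does not state the SNR formula, so the sketch assumes the convention $(a-b)^2 / (2(a+b))$ that is common in the two-community literature; the paper's own normalization may differ by a constant factor. Edge probabilities of $a/n$ within communities and $b/n$ across them are likewise an assumption consistent with constant expected degrees.

```python
import random

def snr(a, b):
    """Assumed SNR convention for a balanced two-community SBM."""
    return (a - b) ** 2 / (2.0 * (a + b))

def sample_sbm(labels, a, b, rng):
    """Sample an undirected graph as a set of edges (i, j) with i < j.

    Nodes with equal labels connect with probability a/n, others with b/n,
    so expected degrees stay constant as n grows (the sparse regime).
    """
    n = len(labels)
    edges = set()
    for i in range(n):
        for j in range(i + 1, n):
            p = a / n if labels[i] == labels[j] else b / n
            if rng.random() < p:
                edges.add((i, j))
    return edges

def flip_labels(labels, s, rng):
    """Return a copy of labels with s distinct nodes moved to the other community."""
    changed = list(labels)
    for i in rng.sample(range(len(labels)), s):
        changed[i] = -changed[i]
    return changed

rng = random.Random(0)
n, a, b, s = 1000, 6.0, 2.0, 200   # illustrative values only

x = [1] * (n // 2) + [-1] * (n // 2)   # original community labels
y = flip_labels(x, s, rng)             # labels after s membership changes

G = sample_sbm(x, a, b, rng)           # graph drawn under the original labels
H = sample_sbm(y, a, b, rng)           # graph drawn under the changed labels

mean_degree = 2 * len(G) / n           # close to (a + b) / 2 in expectation
print("SNR:", snr(a, b))
print("mean degree of G:", mean_degree)
```

A two-sample test would take `G` and `H` as input and decide whether the underlying labels differ in at least $s$ positions; a goodness-of-fit test would instead compare a single graph against a given labelling such as `x`.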
