Contextual Stochastic Block Models
This work addresses the problem of community detection in networks with covariates for researchers in statistical inference and network analysis, offering a foundational theoretical framework.
The paper provides the first information-theoretically tight analysis for inferring latent community structure from sparse graphs with high-dimensional node covariates, showing that combining both information sources is necessary for optimal inference.
We provide the first information theoretic tight analysis for inference of latent community structure given a sparse graph along with high dimensional node covariates, correlated with the same latent communities. Our work bridges recent theoretical breakthroughs in the detection of latent community structure without nodes covariates and a large body of empirical work using diverse heuristics for combining node covariates with graphs for inference. The tightness of our analysis implies in particular, the information theoretical necessity of combining the different sources of information. Our analysis holds for networks of large degrees as well as for a Gaussian version of the model.