Stochastic Blockmodels with Edge Information
This work addresses the problem of extracting richer insights from interaction networks, a problem of interest to researchers and practitioners in network analysis. It is an incremental advance that extends existing blockmodels with topic modeling.
Traditional stochastic blockmodels use only binary network data. The paper addresses this limitation by proposing the Topic Blockmodel, which incorporates edge information such as communication volume and content to improve community detection and to enable tasks such as predicting email recipients and inferring email content.
Stochastic blockmodels allow us to represent networks in terms of a latent community structure, often yielding intuitions about the underlying social structure. Typically, this structure is inferred based only on a binary network representing the presence or absence of interactions between nodes, which limits the amount of information that can be extracted from the data. In practice, many interaction networks contain much more information about the relationship between two nodes. For example, in an email network, the volume of communication between two users and the content of that communication can give us information about both the strength and the nature of their relationship. In this paper, we propose the Topic Blockmodel, a stochastic blockmodel that uses a count-based topic model to capture the interaction modalities within and between latent communities. By explicitly incorporating information sent between nodes in our network representation, we are able to address questions of interest in real-world situations, such as predicting recipients for an email message or inferring the content of an unopened email. Further, by considering topics associated with a pair of communities, we are better able to interpret the nature of each community and the manner in which it interacts with other communities.
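To make the recipient-prediction idea concrete, the following is a minimal toy sketch and not the model defined in this paper. It assumes that each ordered pair of communities has a message rate and a word distribution, and it scores a candidate recipient by the log of the pair's rate plus the log-probability of the message's words under the pair's topic. All names and numbers (`community`, `rate`, `topic`, `log_score`, `rank_recipients`) are hypothetical and chosen only for illustration.

```python
import math

# Hypothetical toy parameters; none of these values come from the paper.
# Each node belongs to one latent community.
community = {"alice": 0, "bob": 0, "carol": 1}

# rate[(a, b)]: assumed expected message volume from community a to community b.
rate = {(0, 0): 5.0, (0, 1): 0.5, (1, 0): 0.5, (1, 1): 4.0}

# topic[(a, b)]: assumed word probabilities for messages sent from a to b.
topic = {
    (0, 0): {"budget": 0.6, "lunch": 0.3, "deploy": 0.1},
    (0, 1): {"budget": 0.1, "lunch": 0.1, "deploy": 0.8},
    (1, 0): {"budget": 0.2, "lunch": 0.2, "deploy": 0.6},
    (1, 1): {"budget": 0.1, "lunch": 0.2, "deploy": 0.7},
}


def log_score(sender, recipient, words):
    """Unnormalized log-score: log pair rate plus word log-probabilities."""
    pair = (community[sender], community[recipient])
    score = math.log(rate[pair])
    for word in words:
        score += math.log(topic[pair][word])
    return score


def rank_recipients(sender, words):
    """Return all other nodes, ordered from most to least plausible recipient."""
    candidates = [node for node in community if node != sender]
    return sorted(
        candidates,
        key=lambda node: log_score(sender, node, words),
        reverse=True,
    )


if __name__ == "__main__":
    print(rank_recipients("alice", ["budget", "budget"]))
    print(rank_recipients("alice", ["deploy", "deploy", "deploy"]))
```

In this toy setting, a message from `alice` about `budget` ranks `bob` (same community) first, while a message dominated by `deploy` ranks `carol` first even though the cross-community rate is lower. This illustrates how message content can shift predictions beyond what communication volume alone would suggest.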