ML · SI · Dec 10, 2015

Scalable Modeling of Conversational-role based Self-presentation Characteristics in Large Online Forums

arXiv:1512.03443v1
Originality: Incremental advance
AI Analysis

This paper addresses the challenge of analyzing complex user behavior patterns in massive online communities, though it is an incremental methodological advance.

The researchers tackled the problem of uncovering implicit sub-network structures in large online forums where users adopt different conversational roles across communities, developing a scalable algorithm that combines topic modeling with mixed membership stochastic block models. Their model outperformed existing methods in predicting user reply structures within threads across three large-scale datasets including StackOverFlow with 1.19 million users.

Online discussion forums are complex webs of overlapping subcommunities (macrolevel structure, across threads) in which users enact different roles depending on which subcommunity they are participating in at a particular time point (microlevel structure, within threads). This sub-network structure is implicit in massive collections of threads. To uncover this structure, we develop a scalable algorithm based on stochastic variational inference and leverage topic models (LDA) along with mixed membership stochastic block (MMSB) models. We evaluate our model on three large-scale datasets: Cancer-ThreadStarter (22K users and 14.4K threads), Cancer-NameMention (15.1K users and 12.4K threads), and StackOverFlow (1.19 million users and 4.55 million threads). Qualitatively, we demonstrate that our model can provide useful explanations of microlevel and macrolevel user presentation characteristics in different communities using the topics discovered from posts. Quantitatively, we show that our model does better than MMSB and LDA in predicting user reply structure within threads. In addition, we demonstrate via synthetic data experiments that the proposed active sub-network discovery model is stable and recovers the original parameters of the experimental setup with high probability.
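For readers unfamiliar with the MMSB component mentioned in the abstract, the following is a minimal sketch of the textbook MMSB generative process for a directed reply network. It is not the paper's combined LDA and MMSB model, nor its stochastic variational inference procedure. The function names, the `alpha` prior, and the role-to-role `block_matrix` are illustrative assumptions.

```python
import random

def sample_dirichlet(alpha, rng):
    """Draw one Dirichlet sample by normalizing independent Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]

def sample_categorical(probs, rng):
    """Draw an index with probability proportional to probs."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]

def sample_mmsb(n_users, alpha, block_matrix, seed=0):
    """Sample a directed network from a standard MMSB (hypothetical helper).

    alpha        : Dirichlet prior over roles (assumed values, one per role)
    block_matrix : block_matrix[g][h] is the assumed probability that a user
                   acting in role g replies to a user acting in role h
    """
    rng = random.Random(seed)
    # Each user gets a mixed-membership vector over roles.
    memberships = [sample_dirichlet(alpha, rng) for _ in range(n_users)]
    edges = []
    for p in range(n_users):
        for q in range(n_users):
            if p == q:
                continue  # no self-replies in this sketch
            # Roles are re-drawn per interaction, so one user can act in
            # different roles toward different partners.
            z_send = sample_categorical(memberships[p], rng)
            z_recv = sample_categorical(memberships[q], rng)
            if rng.random() < block_matrix[z_send][z_recv]:
                edges.append((p, q))
    return memberships, edges

if __name__ == "__main__":
    # Two roles with mostly within-role replies (illustrative numbers only).
    memberships, edges = sample_mmsb(
        n_users=5, alpha=[0.5, 0.5], block_matrix=[[0.8, 0.1], [0.1, 0.8]]
    )
    print(len(edges), "reply edges sampled")
```

Drawing a role per interaction rather than per user is what lets the model express a user who behaves differently in different subcommunities. Inference reverses this process to estimate the membership vectors and the block matrix from observed replies.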
