Predicting Diffusion Reach Probabilities via Representation Learning on Social Networks
This work addresses a domain-specific problem for social network analysis, offering an incremental improvement in predicting diffusion probabilities under data constraints.
The paper tackles the problem of estimating diffusion reach probabilities on social networks from limited cascade data and partial network structure, and it outperforms estimates computed directly from the observed cascades when only a small portion of cascades is available.
The diffusion reach probability between two nodes on a network is defined as the probability that a cascade originating from one node reaches the other. An infinite number of cascades would allow the true diffusion reach probabilities between any two nodes to be calculated. However, only a finite number of cascades exist, and one usually has access to only a small portion of them. In this work, we address the problem of estimating diffusion reach probabilities given only a limited number of cascades and partial information about the underlying network structure. Our proposed strategy uses node representation learning to generate node embeddings, which are fed into machine learning algorithms to build models that predict diffusion reach probabilities. We provide an experimental analysis using synthetically generated cascades on two real-world social networks. The results show that the proposed method is superior to using values calculated directly from the available cascades when the portion of available cascades is small.
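
To make the baseline concrete, the following is a minimal sketch of how reach probabilities could be computed directly from observed cascades: for each source node, count the fraction of its cascades that reached each other node. It is an illustration and not the authors' implementation. The function names `simulate_cascade` and `empirical_reach`, the toy graph, and the choice of an independent cascade model with a single activation probability `p` are all assumptions; the abstract does not say which diffusion model was used to generate the synthetic cascades.

```python
import random
from collections import defaultdict


def simulate_cascade(graph, source, p, rng):
    """Simulate one cascade under an independent cascade model (assumed here).

    Each newly activated node gets a single chance to activate each
    not-yet-active out-neighbor, succeeding with probability p.
    Returns the set of all nodes reached, including the source.
    """
    active = {source}
    frontier = [source]
    while frontier:
        next_frontier = []
        for node in frontier:
            for neighbor in graph.get(node, ()):
                if neighbor not in active and rng.random() < p:
                    active.add(neighbor)
                    next_frontier.append(neighbor)
        frontier = next_frontier
    return active


def empirical_reach(cascades):
    """Estimate reach probabilities from (source, reached_nodes) pairs.

    The estimate for (u, v) is the number of cascades from u that reached v
    divided by the number of cascades from u. Pairs never observed together
    are simply absent from the result.
    """
    started = defaultdict(int)
    reached = defaultdict(int)
    for source, nodes in cascades:
        started[source] += 1
        for node in nodes:
            if node != source:
                reached[(source, node)] += 1
    return {pair: count / started[pair[0]] for pair, count in reached.items()}


if __name__ == "__main__":
    # Hypothetical toy graph: adjacency lists of a small directed network.
    toy_graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
    rng = random.Random(0)
    observed = [("a", simulate_cascade(toy_graph, "a", 0.3, rng)) for _ in range(20)]
    for pair, prob in sorted(empirical_reach(observed).items()):
        print(pair, round(prob, 2))
```

With only a handful of cascades per source, these frequency estimates are noisy, and any pair that never co-occurs in an observed cascade receives no estimate at all. That is the regime in which the abstract reports the embedding-based predictor doing better.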