Detecting Social Influence in Event Cascades by Comparing Discriminative Rankers
This work helps researchers and practitioners who analyze event cascades distinguish social influence from confounds such as homophily, improving incrementally on existing methods.
The paper tackles the challenge of detecting social influence in event cascades from observational data. It proposes a discriminative ranking method that correctly identifies influence in synthetic data, and applies the method to two real-world datasets, U.S. House legislation co-sponsorship and Twitter rumors, where modeling influence improves predictions of cascade trajectories.
The global dynamics of event cascades are often governed by the local dynamics of peer influence. However, detecting social influence from observational data is challenging due to confounds like homophily and practical issues like missing data. We propose a simple discriminative method to detect influence from observational data. The core of the approach is to train a ranking algorithm to predict the source of the next event in a cascade, and to compare its out-of-sample accuracy against a competitive baseline that lacks access to features corresponding to social influence. We analyze synthetically generated data to show that this method correctly identifies influence in the presence of confounds and, unlike well-known alternatives, is robust to both missing data and misspecification. We apply the method to two real-world datasets: (1) the co-sponsorship of legislation in the U.S. House of Representatives on a social network of shared campaign donors; (2) rumors about the Higgs boson discovery on a follower network of $10^5$ Twitter accounts. Our model identifies the role of social influence in these scenarios and uses it to make more accurate predictions about the future trajectory of cascades.
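The comparison at the core of the approach can be illustrated with a minimal sketch. It is not the ranker or feature set used in the paper, which the abstract does not specify. It assumes a linear softmax (conditional logit) ranker, two hypothetical per-candidate features (`propensity`, standing in for non-social covariates, and `exposure`, standing in for a social-influence feature), and synthetic ranking instances drawn from made-up weights `w_true`. All function and variable names are illustrative.

```python
import numpy as np

def softmax_rows(scores):
    """Row-wise softmax with max-subtraction for numerical stability."""
    e = np.exp(scores - scores.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def make_instances(n, k, w, rng):
    """Draw n ranking instances with k candidate sources each.

    Each candidate has len(w) features; the label is the index of the
    candidate that becomes the source of the next event.
    """
    X = rng.normal(size=(n, k, len(w)))
    p = softmax_rows(X @ w)
    y = np.array([rng.choice(k, p=row) for row in p])
    return X, y

def fit_ranker(X, y, steps=500, lr=0.5):
    """Fit a linear softmax ranker by gradient ascent on mean log-likelihood."""
    n, k, d = X.shape
    w = np.zeros(d)
    for _ in range(steps):
        p = softmax_rows(X @ w)
        # Gradient: features of the observed source minus expected features.
        expected = (p[:, :, None] * X).sum(axis=1)
        w += lr * (X[np.arange(n), y] - expected).mean(axis=0)
    return w

def top1_accuracy(X, y, w):
    """Fraction of instances where the top-ranked candidate is the true source."""
    return float(((X @ w).argmax(axis=1) == y).mean())

rng = np.random.default_rng(0)
w_true = np.array([1.0, 2.0])  # assumed weights for [propensity, exposure]
X_train, y_train = make_instances(2000, 10, w_true, rng)
X_test, y_test = make_instances(1000, 10, w_true, rng)

# Full ranker sees both features; the baseline sees only the non-social one.
w_full = fit_ranker(X_train, y_train)
w_base = fit_ranker(X_train[:, :, :1], y_train)

acc_full = top1_accuracy(X_test, y_test, w_full)
acc_base = top1_accuracy(X_test[:, :, :1], y_test, w_base)
print(f"full: {acc_full:.3f}  baseline: {acc_base:.3f}  gap: {acc_full - acc_base:.3f}")
```

In this sketch, a positive held-out accuracy gap is the evidence that the influence feature carries predictive signal beyond the baseline. In real data, the strength of that conclusion depends on how competitive the baseline is, since a weak baseline could credit confounds such as homophily to influence.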