Learning Influence Functions from Incomplete Observations
The paper addresses the challenge of incomplete data in social network analysis, a common issue in real-world applications. The approach is incremental, however, as it builds on existing models such as DLT and DIC.
The paper tackles the problem of learning influence functions from incomplete observations of node activations in social networks. It establishes both proper and improper PAC learnability under missing data, and its experiments demonstrate that the proposed method can compensate for a large fraction of missing observations.
We study the problem of learning influence functions under incomplete observations of node activations. Incomplete observations are a major concern as most (online and real-world) social networks are not fully observable. We establish both proper and improper PAC learnability of influence functions under randomly missing observations. Proper PAC learnability under the Discrete-Time Linear Threshold (DLT) and Discrete-Time Independent Cascade (DIC) models is established by reducing incomplete observations to complete observations in a modified graph. Our improper PAC learnability result applies to the DLT and DIC models as well as the Continuous-Time Independent Cascade (CIC) model. It is based on a parametrization in terms of reachability features, and also gives rise to an efficient and practical heuristic. Experiments on synthetic and real-world datasets demonstrate the ability of our method to compensate even for a fairly large fraction of missing observations.
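To make the setting concrete, the following is a minimal sketch of how incomplete cascade observations might arise under the DIC model. It is not the paper's learning algorithm. The function names (`simulate_dic`, `observe_incomplete`), the `retention` parameter, and the toy graph are hypothetical. The sketch also assumes that each non-seed activation is hidden independently with probability 1 - retention and that seed nodes are always observed.

```python
import random

def simulate_dic(graph, seeds, rng):
    """Run one discrete-time independent cascade.

    graph maps each node to a list of (neighbor, activation_probability).
    Each newly activated node gets one chance to activate each inactive neighbor.
    Returns the set of all activated nodes, seeds included.
    """
    active = set(seeds)
    frontier = sorted(seeds)
    while frontier:
        next_frontier = []
        for u in frontier:
            for v, p in graph.get(u, []):
                if v not in active and rng.random() < p:
                    active.add(v)
                    next_frontier.append(v)
        frontier = next_frontier
    return active

def observe_incomplete(active, seeds, retention, rng):
    """Hide each non-seed activation independently (assumed missingness model).

    A non-seed activated node stays visible with probability `retention`.
    Seeds are always kept, which is a simplifying assumption of this sketch.
    """
    seeds = set(seeds)
    visible = {v for v in sorted(active - seeds) if rng.random() < retention}
    return seeds | visible

# Hypothetical toy graph: node -> [(neighbor, edge activation probability)]
toy_graph = {0: [(1, 1.0), (2, 1.0)], 1: [(3, 1.0)], 2: [], 3: []}
rng = random.Random(0)
full = simulate_dic(toy_graph, {0}, rng)
partial = observe_incomplete(full, {0}, retention=0.5, rng=rng)
```

A learner in this setting sees only `partial`, never `full`, so an unobserved node may be either inactive or active but hidden. That ambiguity is what the reduction to a modified graph and the reachability-feature parametrization have to account for.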