Missing Links in Public Email and Covert Networks: A Comparative Evaluation of Link Prediction, Hyperlink Prediction, and ERGM Estimation
For researchers analyzing partially observed networks, this paper clarifies which method (LP, HP, or ERGM) is most appropriate depending on whether the target is dyadic or higher-order structure.
This paper compares link prediction (LP), hyperlink prediction (HP), and ERGM estimation for inferring missing links in partially observed networks. LP remains strong for dyadic recovery, while HP (especially CHESHIRE) provides gains for higher-order group structure; ERGMs offer an interpretable complement. The evaluation is comparative and reproducible.
We study missing-link inference in partially observed networks by systematically comparing dyadic link prediction (LP) with hyperlink prediction (HP) and an estimation-based exponential random graph model (ERGM) comparator. LP serves as the primary baseline, using classical heuristics computed on the observed graph. HP extends this framework by scoring candidate higher-order structures (cliques) via lifted dyadic scores and via the CHEbyshev Spectral HyperlInk pREdictor (CHESHIRE). All methods are evaluated under a common masking protocol that removes dyadic evidence induced by held-out hyperlinks to ensure comparability. Across public email and covert-network datasets, LP remains strong for dyadic recovery, while HP, particularly CHESHIRE, provides gains when the inferential target is higher-order group structure. ERGMs offer an interpretable dependence-based complement through conditional tie probabilities. The contribution is a comparative, reproducible evaluation clarifying when LP, HP, and ERGM estimation are most appropriate under network missingness.
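To make the masking protocol and the lifted dyadic scores concrete, the following is a minimal Python sketch under stated assumptions, not the evaluation code used in the paper. It assumes that masking deletes every pair contained in a held-out hyperlink from the observed graph (even when another training hyperlink also induces that pair), that the classical heuristic is common neighbors, and that lifting means averaging the pairwise scores over a candidate clique. The function names `mask_observed_graph`, `build_adjacency`, and `lifted_score` are hypothetical.

```python
from itertools import combinations


def clique_edges(hyperlink):
    """Dyadic edges in the clique expansion of one hyperlink (a set of nodes)."""
    return {frozenset(pair) for pair in combinations(sorted(hyperlink), 2)}


def mask_observed_graph(hyperlinks, held_out):
    """Observed dyadic edge set after masking the held-out hyperlinks.

    Assumption: every pair inside a held-out hyperlink is removed, even if a
    remaining training hyperlink would also induce that pair.
    """
    held = {frozenset(h) for h in held_out}
    observed, masked = set(), set()
    for h in hyperlinks:
        if frozenset(h) in held:
            masked |= clique_edges(h)
        else:
            observed |= clique_edges(h)
    return observed - masked


def build_adjacency(edges):
    """Neighbor sets for an undirected graph given as 2-node frozensets."""
    adj = {}
    for edge in edges:
        u, v = tuple(edge)
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def common_neighbors(adj, u, v):
    """Classical LP heuristic: number of shared neighbors of u and v."""
    return len(adj.get(u, set()) & adj.get(v, set()))


def lifted_score(adj, candidate):
    """Lift a dyadic score to a candidate clique by averaging over its pairs.

    Assumption: mean aggregation; min or product are other plausible choices.
    """
    pairs = list(combinations(sorted(candidate), 2))
    return sum(common_neighbors(adj, u, v) for u, v in pairs) / len(pairs)


if __name__ == "__main__":
    # Toy data, invented purely for illustration.
    hyperlinks = [{"a", "b", "c"}, {"b", "c", "d"}, {"c", "d", "e"}, {"a", "c", "d"}]
    held_out = [{"b", "c", "d"}]
    adj = build_adjacency(mask_observed_graph(hyperlinks, held_out))
    print(lifted_score(adj, {"b", "c", "d"}), lifted_score(adj, {"a", "b", "e"}))
```

Because every method sees the same masked graph, a dyadic heuristic cannot recover a held-out group simply by reading back the edges that the group itself induced, which is the comparability concern the protocol addresses.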