SI AI Nov 15, 2023

Predicting Scientific Impact Through Diffusion, Conformity, and Contribution Disentanglement

Zhikai Xue, Guoxiu He, Zhuoren Jiang, Sichen Gu, Yangyang Kang, Star Zhao, Wei Lu

arXiv:2311.09262v42.34 citations h-index: 8 Has Code

Originality: Incremental advance

AI Analysis

This work addresses the challenge of accurately forecasting academic paper citations, a task relevant to researchers and institutions. It appears incremental: it builds on existing dynamic graph methods and adds a novel disentanglement approach.

The paper tackles the problem of predicting scientific impact by disentangling citation influences into diffusion, conformity, and contribution factors, proposing a model (DPPDCC) that outperforms baselines across various publication times in experiments on three datasets.

The scientific impact of academic papers is influenced by intricate factors such as dynamic popularity and inherent contribution. Existing models typically rely on static graphs for citation count estimation and fail to differentiate among the sources of those citations. In contrast, we propose distinguishing the effects of these factors and predicting citation increments as estimated potential impacts within the dynamic context. We introduce a novel model, DPPDCC, which Disentangles the Potential impacts of Papers into Diffusion, Conformity, and Contribution values. It encodes temporal and structural features within dynamic heterogeneous graphs derived from the citation networks and applies several auxiliary tasks for disentanglement. By emphasizing comparative and co-cited/citing information and aggregating snapshots evolutionarily, DPPDCC captures knowledge flow within the citation network. Popularity is then modeled in two ways: contrasting augmented graphs extracts the essence of citation diffusion, and predicting citation accumulation bins models conformity quantitatively. Orthogonal constraints ensure that each perspective is modeled distinctly, preserving the contribution value. To gauge generalization across publication times and replicate the realistic dynamic context, we partition data based on specific time points and retain all samples without strict filtering. Extensive experiments on three datasets validate DPPDCC's superiority over baselines for papers published previously, freshly, and immediately, with further analyses confirming its robustness. Our code and supplementary materials can be found at https://github.com/ECNU-Text-Computing/DPPDCC.
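The abstract does not spell out how the orthogonal constraints are formulated. As a rough illustration of the general idea only, the sketch below penalizes overlap between three per-paper vectors using the mean squared cosine similarity over all pairs. The function names, the choice of penalty, and the toy 3-dimensional embeddings are assumptions made here for illustration, not the authors' implementation.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def orthogonality_penalty(views):
    """Mean squared cosine similarity over all pairs of view vectors.

    The value is 0 when every pair is orthogonal and approaches 1 when
    all vectors are collinear. This is an assumed penalty, not DPPDCC's.
    """
    pairs = list(combinations(views, 2))
    return sum(cosine(u, v) ** 2 for u, v in pairs) / len(pairs)


# Hypothetical embeddings for one paper (toy values, not from the paper).
diffusion = [1.0, 0.0, 0.0]
conformity = [0.0, 2.0, 0.0]
contribution = [0.0, 0.0, 3.0]

# Fully disentangled views incur no penalty.
print(orthogonality_penalty([diffusion, conformity, contribution]))

# A view that overlaps with the diffusion vector raises the penalty.
entangled = [1.0, 1.0, 0.0]
print(orthogonality_penalty([diffusion, entangled, contribution]))
```

In a trained model, a term like this would typically be added to the main prediction loss with a small weight, so that the three representations are pushed apart without dominating the citation-prediction objective.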
