LG AI

Jun 13, 2024

Towards Understanding Link Predictor Generalizability Under Distribution Shifts

Jay Revolinsky, Harry Shomer, Jiliang Tang

arXiv:2406.08788v3

Has Code

Originality: Incremental advance

AI Analysis

The paper addresses a gap in graph learning: most work on distribution shift ignores link-level tasks. The contribution is incremental but matters for real-world applications.

The paper tackles the failure of link prediction models to generalize under distribution shift. It introduces LPShift, a splitting strategy that uses structural properties to induce a controlled shift, and reports drastic changes in the performance of SOTA LP models across 16 LPShift variants of the original dataset splits.

State-of-the-art link prediction (LP) models demonstrate impressive benchmark results. However, popular benchmark datasets often assume that training, validation, and testing samples are representative of the overall dataset distribution. In real-world situations, this assumption is often incorrect; uncontrolled factors lead new dataset samples to come from a different distribution than training samples. Additionally, the majority of recent work on graph dataset shift focuses on node- and graph-level tasks, largely ignoring link-level tasks. To bridge this gap, we introduce a novel splitting strategy, known as LPShift, which utilizes structural properties to induce a controlled distribution shift. We verify LPShift's effect through empirical evaluation of SOTA LP models on 16 LPShift variants of original dataset splits, with results indicating drastic changes to model performance. Additional experiments demonstrate that graph structure has a strong influence on the success of current generalization methods. Source code is available at https://github.com/revolins/LPShift
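To make the idea of a structure-based split concrete, here is a minimal sketch. It is not the LPShift implementation from the repository. It assumes the structural property is the common-neighbor count of each edge, and the function names, the thresholds `low` and `high`, and the toy graph are all hypothetical choices made for illustration.

```python
def build_adjacency(edges):
    """Build an undirected adjacency map {node: set(neighbors)}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def common_neighbors(adj, u, v):
    """Number of nodes adjacent to both u and v."""
    return len(adj[u] & adj[v])


def structural_split(edges, low, high):
    """Assign each edge to train/valid/test by its common-neighbor count.

    Assumption (illustrative only): scores are computed on the full graph.
    score <= low         -> train
    low < score <= high  -> valid
    score > high         -> test
    """
    adj = build_adjacency(edges)
    train, valid, test = [], [], []
    for u, v in edges:
        score = common_neighbors(adj, u, v)
        if score <= low:
            train.append((u, v))
        elif score <= high:
            valid.append((u, v))
        else:
            test.append((u, v))
    return train, valid, test


# Toy graph: a 4-clique (nodes 0-3), a bridge path 3-4-5, and a triangle 5-6-7.
toy_edges = [
    (0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3),  # clique edges
    (3, 4), (4, 5),                                   # bridge edges
    (5, 6), (6, 7), (5, 7),                           # triangle edges
]

train_edges, valid_edges, test_edges = structural_split(toy_edges, low=0, high=1)
print(len(train_edges), len(valid_edges), len(test_edges))
```

In this toy split the model would train only on edges with no shared neighbors and be tested only on edges inside a dense clique, which is the kind of controlled mismatch between training and test structure that the abstract describes.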

