GNNs Meet Sequence Models Along the Shortest-Path: an Expressive Method for Link Prediction
This work addresses link prediction for graph-based applications with a more expressive and efficient method, though the contribution is incremental, as it builds on existing GNN and sequence modeling techniques.
The paper addresses a limitation of GNNs in link prediction: their difficulty capturing link-specific structural patterns. It introduces SP4LP, which combines GNN node encodings with sequence modeling over shortest paths. The authors report state-of-the-art performance across benchmarks and prove that SP4LP is strictly more expressive than standard message-passing GNNs and several structural feature methods.
Graph Neural Networks (GNNs) often struggle to capture the link-specific structural patterns crucial for accurate link prediction, as their node-centric message-passing schemes overlook the subgraph structures connecting a pair of nodes. Existing methods to inject such structural context either incur high computational cost or rely on simplistic heuristics (e.g., common neighbor counts) that fail to model multi-hop dependencies. We introduce SP4LP (Shortest Path for Link Prediction), a novel framework that combines GNN-based node encodings with sequence modeling over shortest paths. Specifically, SP4LP first applies a GNN to compute representations for all nodes, then extracts the shortest path between each candidate node pair and processes the resulting sequence of node embeddings using a sequence model. This design enables SP4LP to capture expressive multi-hop relational patterns efficiently. Empirically, SP4LP achieves state-of-the-art performance across link prediction benchmarks. Theoretically, we prove that SP4LP is strictly more expressive than standard message-passing GNNs and several state-of-the-art structural feature methods, establishing it as a general and principled approach for link prediction in graphs.
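The three stages described above (node encoding, shortest-path extraction, sequence modeling over the path) can be illustrated with a minimal, untrained sketch. This is not the authors' implementation. All function names are hypothetical, mean aggregation stands in for a real GNN, and a fixed-weight recurrent update stands in for a learned sequence model. It also assumes the candidate link is absent from the graph when the path is computed, and that disconnected pairs receive the lowest score.

```python
from collections import deque
import math


def mean_aggregate(adj, feats, rounds=2):
    """Stand-in for a GNN: each round averages a node's features with its neighbors'."""
    h = {v: list(x) for v, x in feats.items()}
    for _ in range(rounds):
        new_h = {}
        for v in adj:
            group = [h[v]] + [h[u] for u in adj[v]]
            new_h[v] = [sum(col) / len(group) for col in zip(*group)]
        h = new_h
    return h


def shortest_path(adj, src, dst):
    """Breadth-first search; returns one shortest path as a node list, or None."""
    parent = {src: None}
    queue = deque([src])
    while queue:
        v = queue.popleft()
        if v == dst:
            path = []
            while v is not None:
                path.append(v)
                v = parent[v]
            return path[::-1]
        for u in adj[v]:
            if u not in parent:
                parent[u] = v
                queue.append(u)
    return None


def sequence_score(embeddings, decay=0.5):
    """Toy order-sensitive sequence model (assumption): a fixed-weight recurrent
    update followed by a sigmoid readout. A trained model would replace this."""
    state = [0.0] * len(embeddings[0])
    for x in embeddings:
        state = [math.tanh(decay * s + xi) for s, xi in zip(state, x)]
    logit = sum(state) / len(state)
    return 1.0 / (1.0 + math.exp(-logit))


def score_link(adj, feats, u, v):
    """Encode all nodes, extract the shortest path u..v, score its embedding sequence."""
    h = mean_aggregate(adj, feats)
    path = shortest_path(adj, u, v)
    if path is None:
        return 0.0  # assumption: disconnected pairs get the lowest score
    return sequence_score([h[n] for n in path])


# Toy graph: the path 0-1-2-3 plus an isolated node 4.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2], 4: []}
feats = {n: [1.0, float(n)] for n in adj}
```

Here `score_link(adj, feats, 0, 3)` scores the embedding sequence along the path 0, 1, 2, 3, while `score_link(adj, feats, 0, 4)` returns 0.0 because no path exists. In this sketch the node encodings are computed once per call for simplicity; a practical version would compute them once and reuse them across all candidate pairs.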