LPNL: Scalable Link Prediction with Large Language Models
This work addresses scalable link prediction for researchers and practitioners working with large graphs, and represents an incremental advance in graph learning with LLMs.
The paper tackles the challenge of applying large language models (LLMs) to link prediction on large-scale heterogeneous graphs by introducing LPNL, a framework that combines novel prompts, two-stage sampling, and fine-tuning; in experiments, it outperforms multiple advanced baselines.
Exploring the application of large language models (LLMs) to graph learning is an emerging endeavor. However, the vast amount of information inherent in large graphs poses significant challenges to this process. This work focuses on the link prediction task and introduces $\textbf{LPNL}$ (Link Prediction via Natural Language), a framework based on large language models designed for scalable link prediction on large-scale heterogeneous graphs. We design novel prompts for link prediction that articulate graph details in natural language. We propose a two-stage sampling pipeline to extract crucial information from the graphs, and a divide-and-conquer strategy to keep the input tokens within predefined limits, addressing the challenge of overwhelming information. We fine-tune a T5 model using our self-supervised learning scheme designed for link prediction. Extensive experimental results demonstrate that LPNL outperforms multiple advanced baselines in link prediction tasks on large-scale graphs.
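To make the divide-and-conquer idea concrete, the following is a minimal sketch and not the paper's implementation. The abstract does not specify how candidates are grouped or how prompts are worded, so the prompt template, the `group_size` parameter, and the `pick` callable (a stand-in for the fine-tuned model) are all assumptions introduced here for illustration. The sketch splits the candidate set into small groups so that each prompt stays short, asks the model to choose one candidate per group, and repeats on the winners until one remains.

```python
from typing import Callable, List, Sequence

# Hypothetical stand-in for the fine-tuned model: given a prompt and the
# candidates it lists, return the index of the chosen candidate.
PickFn = Callable[[str, Sequence[str]], int]


def build_prompt(source: str, candidates: Sequence[str]) -> str:
    """Describe one link prediction query in plain language.

    The wording is an illustrative assumption, not the paper's template.
    """
    options = "; ".join(f"({i}) {c}" for i, c in enumerate(candidates))
    return f"Which candidate is most likely linked to {source}? Candidates: {options}"


def divide_and_conquer(
    source: str,
    candidates: Sequence[str],
    pick: PickFn,
    group_size: int = 3,
) -> str:
    """Reduce a large candidate set to one prediction using small prompts.

    Each prompt lists at most `group_size` candidates, which bounds the
    prompt length regardless of how many candidates there are in total.
    """
    if not candidates:
        raise ValueError("candidates must not be empty")
    if group_size < 2:
        raise ValueError("group_size must be at least 2")

    pool: List[str] = list(candidates)
    while len(pool) > 1:
        winners: List[str] = []
        for start in range(0, len(pool), group_size):
            group = pool[start:start + group_size]
            if len(group) == 1:
                # A lone leftover candidate advances without a model call.
                winners.append(group[0])
            else:
                winners.append(group[pick(build_prompt(source, group), group)])
        pool = winners
    return pool[0]
```

In this sketch the number of model calls grows with the number of candidates, while the size of each individual prompt stays fixed. Bounding the candidate count per prompt is only a proxy for the token limit the abstract mentions; a fuller version would measure prompt length with the model's tokenizer.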