LG · SI · Jul 4, 2024

Generalizing Graph Transformers Across Diverse Graphs and Tasks via Pre-training

Tsinghua
arXiv:2407.03953v4 · 8 citations · h-index: 17
Originality: Incremental advance
AI Analysis

This work addresses the problem of generalizing graph pre-training to large-scale industrial graphs for applications like online games, though it is incremental in building on masked autoencoder architectures.

The authors tackle the challenge of extending graph pre-trained models to web-scale graphs with billions of nodes while avoiding negative transfer. They introduce PGT, a scalable transformer-based framework that achieves state-of-the-art performance on a dataset with 111 million nodes and 1.6 billion edges and scales to real-world graphs with over 540 million nodes and 12 billion edges.

Graph pre-training has been concentrated on graph-level tasks involving small graphs (e.g., molecular graphs) or learning node representations on a fixed graph. Extending graph pre-trained models to web-scale graphs with billions of nodes in industrial scenarios, while avoiding negative transfer across graphs or tasks, remains a challenge. We aim to develop a general graph pre-trained model with inductive ability that can make predictions for unseen new nodes and even new graphs. In this work, we introduce a scalable transformer-based graph pre-training framework called PGT (Pre-trained Graph Transformer). Based on the masked autoencoder architecture, we design two pre-training tasks: one for reconstructing node features and the other for reconstructing local structures. Unlike the original autoencoder architecture where the pre-trained decoder is discarded, we propose a novel strategy that utilizes the decoder for feature augmentation. Our framework, tested on the publicly available ogbn-papers100M dataset with 111 million nodes and 1.6 billion edges, achieves state-of-the-art performance, showcasing scalability and efficiency. We have deployed our framework on Tencent's online game data, confirming its capability to pre-train on real-world graphs with over 540 million nodes and 12 billion edges and to generalize effectively across diverse static and dynamic downstream tasks.
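The feature-reconstruction task described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the helper names (`mask_node_features`, `reconstruction_loss`), the zero mask token, the mean-squared-error loss, and the random linear maps standing in for the transformer encoder and decoder are all assumptions made for illustration. It shows only the general masked autoencoder idea of corrupting a subset of node features and scoring the reconstruction on the masked nodes; the structure-reconstruction task and the decoder-based feature augmentation are not covered.

```python
import numpy as np

def mask_node_features(features, mask_rate, mask_token, rng):
    """Replace a random subset of node feature rows with a mask token."""
    num_nodes = features.shape[0]
    num_masked = max(1, int(round(mask_rate * num_nodes)))
    masked_idx = rng.choice(num_nodes, size=num_masked, replace=False)
    corrupted = features.copy()
    corrupted[masked_idx] = mask_token
    return corrupted, masked_idx

def reconstruction_loss(original, reconstructed, masked_idx):
    """Mean squared error computed only on the masked nodes (assumed loss)."""
    diff = original[masked_idx] - reconstructed[masked_idx]
    return float(np.mean(diff ** 2))

rng = np.random.default_rng(0)
features = rng.normal(size=(8, 4))   # 8 toy nodes with 4 features each
mask_token = np.zeros(4)             # hypothetical mask token; typically learned

corrupted, masked_idx = mask_node_features(features, 0.5, mask_token, rng)

# Stand-ins for the encoder and decoder: random linear maps, not transformers.
w_enc = rng.normal(size=(4, 16))
w_dec = rng.normal(size=(16, 4))
hidden = np.tanh(corrupted @ w_enc)
reconstructed = hidden @ w_dec

loss = reconstruction_loss(features, reconstructed, masked_idx)
print(f"masked nodes: {sorted(masked_idx.tolist())}, loss: {loss:.4f}")
```

In an actual pre-training loop the encoder and decoder weights would be trained to minimize this loss; here they are left random so the example stays self-contained.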

