LG AIFeb 27, 2025

Accurate and Scalable Graph Neural Networks via Message Invariance

Zhihao Shi, Jie Wang, Zhiwei Zhuang, Xize Liang, Bin Li, Feng Wu

arXiv:2502.19693v111.45 citationsh-index: 10Has CodeICLR

Originality Incremental advance

AI Analysis

This addresses scalability issues for GNNs in real-world applications, enabling efficient training on large graphs, though it is an incremental improvement over existing mini-batch methods.

The paper tackles the computational inefficiency of message passing in graph neural networks (GNNs) on large-scale graphs by proposing a topological compensation (TOP) method that eliminates costly out-of-batch message passing, achieving order-of-magnitude speed improvements with minimal accuracy loss on graphs with millions of nodes and billions of edges.

Message passing-based graph neural networks (GNNs) have achieved great success in many real-world applications. For a sampled mini-batch of target nodes, the message passing process is divided into two parts: message passing between nodes within the batch (MP-IB) and message passing from nodes outside the batch to those within it (MP-OB). However, MP-OB recursively relies on higher-order out-of-batch neighbors, leading to an exponentially growing computational cost with respect to the number of layers. Due to the neighbor explosion, the whole message passing stores most nodes and edges on the GPU such that many GNNs are infeasible to large-scale graphs. To address this challenge, we propose an accurate and fast mini-batch approach for large graph transductive learning, namely topological compensation (TOP), which obtains the outputs of the whole message passing solely through MP-IB, without the costly MP-OB. The major pillar of TOP is a novel concept of message invariance, which defines message-invariant transformations to convert costly MP-OB into fast MP-IB. This ensures that the modified MP-IB has the same output as the whole message passing. Experiments demonstrate that TOP is significantly faster than existing mini-batch methods by order of magnitude on vast graphs (millions of nodes and billions of edges) with limited accuracy degradation.

View on arXiv PDF Code

Similar