Algorithm and System Co-design for Efficient Subgraph-based Graph Representation Learning
This work addresses scalability problems faced by researchers and practitioners who use SGRL for applications such as link and motif prediction. The contribution is incremental in that it builds on existing SGRL methods.
The paper tackles the scalability issue in subgraph-based graph representation learning (SGRL) by proposing SUREL, a framework that co-designs the learning algorithm and system support, achieving a 10x speed-up over SGRL baselines and a 50% prediction accuracy improvement compared to canonical GNNs.
Subgraph-based graph representation learning (SGRL) was recently proposed to address some fundamental challenges encountered by canonical graph neural networks (GNNs), and has demonstrated advantages in many important data science applications such as link, relation, and motif prediction. However, current SGRL approaches suffer from scalability issues because they require extracting a subgraph for each training or test query. Recent solutions that scale up canonical GNNs may not apply to SGRL. Here, we propose a novel framework, SUREL, for scalable SGRL by co-designing the learning algorithm and its system support. SUREL adopts a walk-based decomposition of subgraphs and reuses the walks to form subgraphs, which substantially reduces the redundancy of subgraph extraction and supports parallel computation. Experiments over six homogeneous, heterogeneous, and higher-order graphs with millions of nodes and edges demonstrate the effectiveness and scalability of SUREL. In particular, compared to SGRL baselines, SUREL achieves a 10$\times$ speed-up with comparable or even better prediction performance. Compared to canonical GNNs, SUREL achieves a 50% improvement in prediction accuracy.
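
The walk-reuse idea described in the abstract can be illustrated with a minimal Python sketch. This is not SUREL's implementation: the function names (`sample_walks`, `precompute_walks`, `query_subgraph_nodes`), the uniform random-walk sampler, the toy graph, and the parameters `num_walks` and `walk_len` are all assumptions chosen for illustration. The sketch samples a fixed set of walks per node once, then forms the node set of a query subgraph by joining the stored walks of the queried nodes, so that no per-query subgraph extraction is needed.

```python
import random
from typing import Dict, List, Sequence, Set

Graph = Dict[int, List[int]]  # adjacency list: node -> neighbours


def sample_walks(graph: Graph, start: int, num_walks: int,
                 walk_len: int, rng: random.Random) -> List[List[int]]:
    """Sample `num_walks` uniform random walks of `walk_len` steps from `start`."""
    walks = []
    for _ in range(num_walks):
        walk = [start]
        for _ in range(walk_len):
            neighbours = graph[walk[-1]]
            if not neighbours:  # dead end: stop this walk early
                break
            walk.append(rng.choice(neighbours))
        walks.append(walk)
    return walks


def precompute_walks(graph: Graph, num_walks: int, walk_len: int,
                     seed: int = 0) -> Dict[int, List[List[int]]]:
    """Sample walks once per node so that later queries can reuse them."""
    rng = random.Random(seed)
    return {node: sample_walks(graph, node, num_walks, walk_len, rng)
            for node in graph}


def query_subgraph_nodes(walk_store: Dict[int, List[List[int]]],
                         query: Sequence[int]) -> Set[int]:
    """Join the stored walks of every queried node into one node set."""
    nodes: Set[int] = set()
    for q in query:
        for walk in walk_store[q]:
            nodes.update(walk)
    return nodes


# Hypothetical toy graph (undirected, stored as symmetric adjacency lists).
graph: Graph = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4], 4: [3]}
walk_store = precompute_walks(graph, num_walks=4, walk_len=2)

link_nodes = query_subgraph_nodes(walk_store, (0, 3))      # link query
motif_nodes = query_subgraph_nodes(walk_store, (0, 3, 4))  # higher-order query
```

In this sketch the walks of nodes 0 and 3 are sampled once and serve both the link query and the motif query, which is the kind of redundancy reduction the abstract refers to. Because walks are stored per node and sampled independently of any query, the precomputation step could also be split across workers.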