DCLGSep 14, 2020

Accelerating Graph Sampling for Graph Machine Learning using GPUs

arXiv:2009.06693v492 citations
Originality Incremental advance
AI Analysis

This addresses a bottleneck in graph ML training for researchers and practitioners, though it is incremental as it optimizes an existing process rather than introducing a new paradigm.

The paper tackles the problem of slow graph sampling in graph machine learning by presenting NextDoor, a GPU-accelerated system that uses transit-parallelism to improve efficiency, achieving orders of magnitude faster performance compared to existing systems.

Representation learning algorithms automatically learn the features of data. Several representation learning algorithms for graph data, such as DeepWalk, node2vec, and GraphSAGE, sample the graph to produce mini-batches that are suitable for training a DNN. However, sampling time can be a significant fraction of training time, and existing systems do not efficiently parallelize sampling. Sampling is an embarrassingly parallel problem and may appear to lend itself to GPU acceleration, but the irregularity of graphs makes it hard to use GPU resources effectively. This paper presents NextDoor, a system designed to effectively perform graph sampling on GPUs. NextDoor employs a new approach to graph sampling that we call transit-parallelism, which allows load balancing and caching of edges. NextDoor provides end-users with a high-level abstraction for writing a variety of graph sampling algorithms. We implement several graph sampling applications, and show that NextDoor runs them orders of magnitude faster than existing systems.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes