LGOct 24, 2022

Layer-Neighbor Sampling -- Defusing Neighborhood Explosion in GNNs

Georgia Tech
arXiv:2210.13339v234 citationsh-index: 51
Originality Incremental advance
AI Analysis

This addresses scalability issues for researchers and practitioners training GNNs on large graphs, representing an incremental improvement over prior sampling techniques.

The paper tackles the challenge of neighborhood explosion in Graph Neural Networks (GNNs) during large-scale training by proposing a new sampling algorithm called LABOR, which reduces vertex sampling by up to 7 times without sacrificing quality and allows for up to 112 times larger batch sizes compared to existing methods.

Graph Neural Networks (GNNs) have received significant attention recently, but training them at a large scale remains a challenge. Mini-batch training coupled with sampling is used to alleviate this challenge. However, existing approaches either suffer from the neighborhood explosion phenomenon or have poor performance. To address these issues, we propose a new sampling algorithm called LAyer-neighBOR sampling (LABOR). It is designed to be a direct replacement for Neighbor Sampling (NS) with the same fanout hyperparameter while sampling up to 7 times fewer vertices, without sacrificing quality. By design, the variance of the estimator of each vertex matches NS from the point of view of a single vertex. Moreover, under the same vertex sampling budget constraints, LABOR converges faster than existing layer sampling approaches and can use up to 112 times larger batch sizes compared to NS.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes