LG | ML | Sep 26, 2025

Neighborhood Sampling Does Not Learn the Same Graph Neural Network

arXiv:2509.22868v1 | h-index: 1
Originality: Synthesis-oriented
AI Analysis

This work fills a theoretical gap for researchers and practitioners in graph machine learning. It is incremental, however: it applies existing tools to known sampling methods and proposes no new ones.

The paper studies the systemic behavior of neighborhood sampling in graph neural networks (GNNs) through a theoretical analysis based on neural tangent kernels. It shows that, with limited samples, different sampling approaches yield distinct posteriors, and that these converge to the same posterior only as the sample size increases. No approach dominates in terms of posterior covariance.

Neighborhood sampling is an important ingredient in the training of large-scale graph neural networks. It suppresses the exponential growth of the neighborhood size across network layers and keeps memory consumption and time costs feasible. While it has become a standard implementation in practice, its systemic behaviors are less understood. We conduct a theoretical analysis by using the tool of neural tangent kernels, which characterize the (analogous) training dynamics of neural networks based on their infinitely wide counterparts, Gaussian processes (GPs). We study several established neighborhood sampling approaches and the corresponding posterior GP. With limited samples, the posteriors are all different, although they converge to the same one as the sample size increases. Moreover, the posterior covariance, which lower-bounds the mean squared prediction error, is incomparable, aligning with observations that no sampling approach dominates.
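
As a rough illustration of why a limited sample changes what a layer computes, the sketch below estimates a single mean aggregation from a sampled subset of neighbors and compares it with the exact mean. It is a toy under stated assumptions, not the paper's neural tangent kernel analysis: the graph, the features, the uniform without-replacement sampler, and the names `sampled_mean_aggregate` and `mean_sq_error` are all hypothetical.

```python
import numpy as np


def sampled_mean_aggregate(features, neighbors, fanout, rng):
    """Mean of neighbor features, estimated from at most `fanout` neighbors.

    Neighbors are drawn uniformly without replacement (an assumed sampler;
    the paper studies several established approaches).
    """
    k = min(fanout, len(neighbors))
    picked = rng.choice(neighbors, size=k, replace=False)
    return features[picked].mean(axis=0)


rng = np.random.default_rng(0)
features = rng.normal(size=(200, 4))  # toy node features (hypothetical data)
neighbors = np.arange(1, 101)         # node 0 has 100 neighbors in this toy graph
exact = features[neighbors].mean(axis=0)


def mean_sq_error(fanout, trials=500):
    """Average squared distance between the sampled and the exact aggregate."""
    errs = [
        np.sum((sampled_mean_aggregate(features, neighbors, fanout, rng) - exact) ** 2)
        for _ in range(trials)
    ]
    return float(np.mean(errs))


# The error shrinks as the fanout grows and vanishes (up to floating-point
# rounding) once every neighbor is sampled.
errors = {fanout: mean_sq_error(fanout) for fanout in (5, 25, 100)}
```

With a small fanout the aggregate varies from draw to draw, so training sees a different effective operator than full-neighborhood aggregation; as the fanout approaches the neighborhood size the gap closes, which loosely mirrors the convergence the abstract describes.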
