Calibrate and Debias Layer-wise Sampling for Graph Convolutional Networks
This work addresses efficiency and accuracy issues in training Graph Convolutional Networks (GCNs) for graph learning applications, and represents an incremental improvement over existing layer-wise sampling methods.
The paper tackles two problems in layer-wise sampling for GCNs: suboptimal sampling probabilities and estimation biases. It proposes a new principle for constructing sampling probabilities and a debiasing algorithm, and supports the resulting improvements with variance analyses and experiments on benchmarks.
Multiple sampling-based methods have been developed for approximating and accelerating node embedding aggregation in the training of graph convolutional networks (GCNs). Among them, a layer-wise approach recursively performs importance sampling to select neighbors jointly for the existing nodes in each layer. This paper revisits the approach from a matrix approximation perspective and identifies two issues in existing layer-wise sampling methods: suboptimal sampling probabilities and estimation biases induced by sampling without replacement. To address these issues, we propose two corresponding remedies: a new principle for constructing sampling probabilities and an efficient debiasing algorithm. The improvements are demonstrated by extensive analyses of estimation variance and experiments on common benchmarks. Code and algorithm implementations are publicly available at https://github.com/ychen-stat-ml/GCN-layer-wise-sampling.
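The matrix approximation view mentioned above can be illustrated with a small example. The following is a minimal Python sketch, not the authors' implementation: the function names `column_norm_probs`, `sampled_product`, and `exact_product`, the toy matrices, and the choice of probabilities proportional to squared column norms (a conventional baseline, not the new principle proposed in this paper) are all assumptions made for illustration. It estimates the product of a propagation matrix P and an embedding matrix H from a few columns of P drawn with replacement.

```python
import random


def column_norm_probs(P):
    """Probabilities proportional to squared column norms of P (assumed baseline)."""
    n_cols = len(P[0])
    sq = [sum(row[j] ** 2 for row in P) for j in range(n_cols)]
    total = sum(sq)
    return [v / total for v in sq]


def exact_product(P, H):
    """Plain dense product P @ H, used as the reference value."""
    return [[sum(P[r][j] * H[j][f] for j in range(len(H)))
             for f in range(len(H[0]))] for r in range(len(P))]


def sampled_product(P, H, probs, s, rng):
    """Estimate P @ H from s column indices drawn WITH replacement.

    Each draw j contributes P[:, j] H[j, :] / (s * probs[j]), so the
    estimate is unbiased for any probs that are positive wherever
    column j of P is nonzero.
    """
    n_rows, n_feat = len(P), len(H[0])
    idx = rng.choices(range(len(probs)), weights=probs, k=s)
    est = [[0.0] * n_feat for _ in range(n_rows)]
    for j in idx:
        w = 1.0 / (s * probs[j])  # importance weight for this draw
        for r in range(n_rows):
            for f in range(n_feat):
                est[r][f] += w * P[r][j] * H[j][f]
    return est


# Toy row-normalized propagation matrix (3 target nodes, 4 candidate neighbors)
# and 2-dimensional embeddings; both are hypothetical illustration data.
P = [[0.5, 0.5, 0.0, 0.0],
     [0.25, 0.25, 0.25, 0.25],
     [0.0, 0.0, 0.5, 0.5]]
H = [[1.0, 0.0], [0.0, 1.0], [2.0, 1.0], [1.0, 2.0]]

rng = random.Random(0)
probs = column_norm_probs(P)
estimate = sampled_product(P, H, probs, s=2, rng=rng)
print("exact   :", exact_product(P, H))
print("estimate:", estimate)
```

A single estimate is noisy, but its average over many independent draws approaches the exact product because the indices are drawn with replacement. If the same weights `1 / (s * probs[j])` were reused after drawing distinct indices without replacement, the probability of including index `j` would generally no longer equal `s * probs[j]`, which is the kind of estimation bias the abstract refers to.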