Stochastic Training of Graph Convolutional Networks with Variance Reduction
The work addresses a scalability bottleneck for researchers and practitioners who train GCNs on large graphs, though the contribution is incremental in that it builds on existing neighbor sampling approaches.
The paper tackles the exponential growth of the receptive field in Graph Convolutional Networks (GCNs) by developing control-variate-based algorithms that allow sampling an arbitrarily small number of neighbors per node, reducing runtime on a large Reddit dataset to one seventh of that of previous neighbor sampling methods.
Graph convolutional networks (GCNs) are powerful deep neural networks for graph-structured data. However, a GCN computes the representation of a node recursively from its neighbors, so the receptive field size grows exponentially with the number of layers. Previous attempts at reducing the receptive field size by subsampling neighbors do not have a convergence guarantee, and their receptive field size per node is still on the order of hundreds. In this paper, we develop control-variate-based algorithms that allow sampling an arbitrarily small number of neighbors. Furthermore, we prove a new theoretical guarantee that our algorithms converge to a local optimum of GCN. Empirical results show that our algorithms converge similarly to the exact algorithm while using only two neighbors per node. The runtime of our algorithms on a large Reddit dataset is only one seventh of that of previous neighbor sampling algorithms.
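
To make the control variate idea concrete, the following is a minimal NumPy sketch, not the paper's implementation. It assumes a simple mean aggregator over one node's neighborhood and assumes that a stored "historical" activation h_hist is kept for every node and stays close to the current activation h. All function and variable names here are hypothetical. The sketch compares plain neighbor sampling with a control variate estimator that samples only the difference h - h_hist and adds back the exact mean of h_hist over the full neighborhood.

```python
import numpy as np


def exact_mean(h, neighbors):
    """Exact aggregation: mean of activations over all neighbors."""
    return h[neighbors].mean(axis=0)


def sampled_mean(h, neighbors, k, rng):
    """Plain neighbor sampling: mean over k neighbors drawn without replacement."""
    picked = rng.choice(neighbors, size=k, replace=False)
    return h[picked].mean(axis=0)


def control_variate_mean(h, h_hist, neighbors, k, rng):
    """Control variate estimator (illustrative assumption, see lead-in).

    Only the difference h - h_hist is estimated from k sampled neighbors.
    The mean of the stored activations h_hist is computed over the full
    neighborhood, which needs no recursion because h_hist is already stored.
    """
    picked = rng.choice(neighbors, size=k, replace=False)
    sampled_diff = (h[picked] - h_hist[picked]).mean(axis=0)
    return sampled_diff + h_hist[neighbors].mean(axis=0)


rng = np.random.default_rng(0)
num_nodes, dim = 200, 8

# Synthetic activations; h_hist is a slightly stale copy of h (assumption).
h = rng.normal(size=(num_nodes, dim))
h_hist = h + 0.05 * rng.normal(size=(num_nodes, dim))

neighbors = np.arange(50)  # neighborhood of one hypothetical node
exact = exact_mean(h, neighbors)

# Mean squared error of each estimator with only two sampled neighbors.
trials = 2000
err_ns = np.mean(
    [np.sum((sampled_mean(h, neighbors, 2, rng) - exact) ** 2) for _ in range(trials)]
)
err_cv = np.mean(
    [
        np.sum((control_variate_mean(h, h_hist, neighbors, 2, rng) - exact) ** 2)
        for _ in range(trials)
    ]
)

print(f"neighbor sampling MSE: {err_ns:.4f}")
print(f"control variate MSE:   {err_cv:.4f}")
```

Both estimators are unbiased for the exact mean under uniform sampling, but the control variate version has error driven by h - h_hist rather than by h itself. In this toy setup, when the stored activations match the current ones exactly, the estimator returns the exact mean regardless of which neighbors are sampled, which gives some intuition for why a very small sample size can suffice.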