Data Augmentation View on Graph Convolutional Network and the Proposal of Monte Carlo Graph Learning
This work offers a transparent alternative to GCNs for graph learning, though it is incremental: it builds on existing methods and offers limited practical gains in noisy settings.
The authors introduced a new data augmentation perspective on graph convolutional networks (GCNs) and proposed Monte Carlo Graph Learning (MCGL), a paradigm that propagates labels through graphs to expand training sets for traditional classifiers. They found MCGL outperforms GCNs on clean synthetic graphs but is less tolerant to noise on four real-world datasets, and used it to argue that graph structure noise, not over-smoothing, causes performance degradation in deep GCNs.
Graph convolutional networks (GCNs) are currently understood in two main ways, from the spectral domain and from the spatial domain, but neither view is transparent. In this work, we introduce a new view of GCNs, data augmentation, which is more transparent than the previous two. Inspired by this view, we propose a new graph learning paradigm, Monte Carlo Graph Learning (MCGL). The core idea of MCGL has two parts: (1) Data augmentation: propagate the labels of the training set through the graph structure to expand the training set; (2) Model training: use the expanded training set to train traditional classifiers. We use synthetic datasets to compare the strengths of MCGL and the graph convolutional operation on clean graphs. In addition, we show that on noisy graphs (four real-world datasets), MCGL's tolerance to graph structure noise is weaker than GCN's. Moreover, inspired by MCGL, we re-analyze why the performance of GCN degrades when it is deepened too much: contrary to the mainstream view that attributes this to over-smoothing, we argue that the main cause is graph structure noise, and we verify this view experimentally. The code is available at https://github.com/DongHande/MCGL.
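The data augmentation step (1) can be illustrated with a minimal sketch. This is not the authors' implementation (see the repository linked above for that); it only assumes that labels are propagated by short uniform random walks starting from each labeled node. The function name `mc_expand_labels`, the adjacency-dict format, and the parameters `depth` and `samples_per_node` are all hypothetical and chosen for illustration.

```python
import random

def mc_expand_labels(adj, train_labels, depth=2, samples_per_node=5, seed=0):
    """Expand a labeled set by random walks from each labeled node.

    adj: dict mapping node -> list of neighbor nodes (hypothetical format).
    train_labels: dict mapping labeled node -> class label.
    Returns a list of (node, label) pairs: the originals plus pseudo-labels.
    """
    rng = random.Random(seed)
    expanded = list(train_labels.items())  # keep the true labels first
    for start, label in train_labels.items():
        for _ in range(samples_per_node):
            node = start
            for _ in range(depth):
                neighbors = adj.get(node, [])
                if not neighbors:
                    break  # dead end: stop the walk here
                node = rng.choice(neighbors)  # assumption: uniform sampling
            if node not in train_labels:
                # Assumption: the walk's endpoint shares the start node's label.
                expanded.append((node, label))
    return expanded

# Toy clean graph: two triangles (0-1-2 and 3-4-5) with no edges between them.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1], 3: [4, 5], 4: [3, 5], 5: [3, 4]}
train_labels = {0: "a", 3: "b"}
expanded = mc_expand_labels(adj, train_labels, depth=2, samples_per_node=4)
```

The resulting `(node, label)` pairs, joined with each node's features, could then be passed to any traditional classifier for step (2). On this clean toy graph every pseudo-label is correct because no edge crosses the two classes; a single cross-class edge would let a walk assign a wrong label, which is one way to picture the sensitivity to graph structure noise described above.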