Metropolis-Hastings Data Augmentation for Graph Neural Networks
This paper addresses a domain-specific problem for researchers and practitioners in graph-based machine learning: it offers an incremental improvement in data augmentation techniques for semi-supervised learning on graphs.
The paper tackles weak generalization in Graph Neural Networks (GNNs) caused by sparsely labeled data. It proposes Metropolis-Hastings Data Augmentation (MH-Aug), which draws augmented graphs from a target distribution, and reports experiments showing significant performance gains.
Graph Neural Networks (GNNs) often suffer from weak generalization due to sparsely labeled data, despite their promising results on various graph-based tasks. Data augmentation is a prevalent remedy for improving the generalization ability of models in many domains. However, due to the non-Euclidean nature of the data space and the dependencies between samples, designing effective augmentation on graphs is challenging. In this paper, we propose a novel framework, Metropolis-Hastings Data Augmentation (MH-Aug), that draws augmented graphs from an explicit target distribution for semi-supervised learning. MH-Aug produces a sequence of augmented graphs from the target distribution, which enables flexible control of the strength and diversity of augmentation. Since direct sampling from the complex target distribution is challenging, we adopt the Metropolis-Hastings algorithm to obtain the augmented samples. We also propose a simple and effective semi-supervised learning strategy that uses the samples generated by MH-Aug. Our extensive experiments demonstrate that MH-Aug can generate a sequence of samples according to the target distribution and significantly improve the performance of GNNs.
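The abstract does not spell out the target distribution or the proposal, so the following is only a minimal sketch of the general idea of drawing a sequence of augmented graphs with Metropolis-Hastings. Everything specific in it is an assumption made for illustration: the augmentation is a binary edge mask, the hypothetical target `log_target` scores a mask by how close its dropped-edge ratio is to `mu`, and the proposal flips one uniformly chosen edge. Because that proposal is symmetric, the acceptance probability reduces to min(1, target(new) / target(old)). None of these choices should be read as the paper's actual design.

```python
import math
import random


def log_target(mask, mu=0.3, sigma=0.1):
    """Unnormalized log-density of an edge mask (hypothetical target).

    A mask scores higher the closer its dropped-edge ratio is to `mu`;
    `sigma` controls how sharply the score falls off.
    """
    dropped_ratio = 1.0 - sum(mask) / len(mask)
    return -((dropped_ratio - mu) ** 2) / (2.0 * sigma ** 2)


def mh_augment(num_edges, num_steps, mu=0.3, sigma=0.1, seed=0):
    """Run a Metropolis-Hastings chain over edge masks.

    mask[i] == 1 keeps edge i, mask[i] == 0 drops it.
    Returns the chain as a list of masks, one per step.
    """
    rng = random.Random(seed)
    mask = [1] * num_edges  # start from the original graph
    current = log_target(mask, mu, sigma)
    samples = []
    for _ in range(num_steps):
        # Symmetric proposal: flip one uniformly chosen edge.
        i = rng.randrange(num_edges)
        mask[i] ^= 1
        proposed = log_target(mask, mu, sigma)
        # Accept with probability min(1, exp(proposed - current)).
        if rng.random() < math.exp(min(0.0, proposed - current)):
            current = proposed
        else:
            mask[i] ^= 1  # rejected: undo the flip
        samples.append(list(mask))
    return samples


if __name__ == "__main__":
    chain = mh_augment(num_edges=50, num_steps=2000)
    kept = sum(chain[-1])
    print(f"last sample keeps {kept} of 50 edges")
```

In this sketch each mask in the returned chain would be applied to the original edge list to obtain one augmented graph. Note that the hypothetical target is defined over masks, not over drop ratios, so the drop ratios seen along the chain also reflect how many masks share each ratio; `mu` and `sigma` shape the strength and spread of augmentation without fixing the average ratio exactly.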