GraphMAD: Graph Mixup for Data Augmentation using Data-Driven Convex Clustering
This work addresses limited training data in graph classification with a domain-specific solution that incrementally advances graph mixup techniques.
The paper tackles data augmentation for graph data with a novel nonlinear mixup mechanism that projects graphs onto a latent space of graphons and applies convex clustering in that space, reporting significant improvements in graph classification on benchmark datasets.
We develop a novel data-driven nonlinear mixup mechanism for graph data augmentation and present different mixup functions for sample pairs and their labels. Mixup is a data augmentation method that creates new training data by linearly interpolating between pairs of data samples and their labels. Mixup of graph data is challenging because interpolation between graphs of potentially different sizes is an ill-posed operation. Hence, a promising approach for graph mixup is to first project the graphs onto a common latent feature space and then explore linear and nonlinear mixup strategies in this latent space. In this context, we propose to (i) project graphs onto the latent space of continuous random graph models known as graphons, (ii) leverage convex clustering in this latent space to generate nonlinear data-driven mixup functions, and (iii) investigate the use of different mixup functions for labels and data samples. We evaluate our graph data augmentation method on benchmark datasets and demonstrate that nonlinear data-driven mixup functions can significantly improve graph classification.
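The pipeline described above can be illustrated with a minimal sketch, under several assumptions that are not taken from the paper. Graphons are approximated here by a simple degree-sorted, block-averaged step function (one common estimator, not necessarily the one the authors use). The convex clustering step is not implemented; in its place, the mixup function accepts arbitrary hypothetical callables `data_fn` and `label_fn` that map the mixing weight to the weights applied to samples and labels. All function names and the `resolution` parameter are illustrative.

```python
import numpy as np


def estimate_graphon(adj, resolution=8):
    """Approximate a graphon as a resolution x resolution step function.

    Assumption: nodes are aligned by sorting on degree, then the sorted
    adjacency matrix is averaged over equal-sized blocks.
    """
    n = adj.shape[0]
    order = np.argsort(-adj.sum(axis=1), kind="stable")
    sorted_adj = adj[np.ix_(order, order)]
    # Assign each sorted node to one of `resolution` roughly equal bins.
    bins = np.arange(n) * resolution // n
    graphon = np.zeros((resolution, resolution))
    for i in range(resolution):
        for j in range(resolution):
            block = sorted_adj[np.ix_(bins == i, bins == j)]
            graphon[i, j] = block.mean() if block.size else 0.0
    return graphon


def mixup(w_a, w_b, y_a, y_b, lam, data_fn=lambda t: t, label_fn=lambda t: t):
    """Mix two graphons and their labels in the common latent space.

    With the default identity functions this is ordinary linear mixup.
    Passing different callables gives separate (possibly nonlinear)
    mixup functions for samples and labels. These callables are
    hypothetical stand-ins for the data-driven functions in the text.
    """
    a = data_fn(lam)
    b = label_fn(lam)
    w_new = a * w_a + (1.0 - a) * w_b
    y_new = b * y_a + (1.0 - b) * y_b
    return w_new, y_new


def sample_graph(graphon, n_nodes, rng):
    """Sample an undirected simple graph from a step-function graphon."""
    res = graphon.shape[0]
    u = rng.uniform(size=n_nodes)
    idx = np.minimum((u * res).astype(int), res - 1)
    probs = graphon[np.ix_(idx, idx)]
    # Draw each edge once (strict upper triangle), then symmetrize.
    upper = np.triu(rng.uniform(size=(n_nodes, n_nodes)) < probs, k=1)
    return (upper | upper.T).astype(int)
```

Because both graphs are mapped to step functions of the same resolution, graphs of different sizes can be interpolated entrywise, which is what makes the operation well-posed in the latent space. A sigmoid such as `lambda t: 1 / (1 + np.exp(-10 * (t - 0.5)))` could be passed as `label_fn` to experiment with a nonlinear label mixup, again purely as an illustration.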