SC-MAD: Mixtures of Higher-order Networks for Data Augmentation
This work addresses data scarcity in machine learning on simplicial complexes, a setting that matters for complex systems with multiway interactions. The contribution is incremental in that it adapts existing mixup techniques to a new domain.
The paper tackles the problem of limited data for learning on simplicial complexes, which model higher-order interactions. It proposes data augmentation methods based on linear and nonlinear mixup mechanisms, including a convex clustering approach, to generate synthetic labeled samples. It reports improved classification performance on synthetic and real-world datasets.
The myriad complex systems with multiway interactions motivate the extension of graph-based pairwise connections to higher-order relations. In particular, the simplicial complex has inspired generalizations of graph neural networks (GNNs) to simplicial complex-based models. Learning on such systems requires large amounts of data, which can be expensive or impossible to obtain. We propose data augmentation of simplicial complexes through both linear and nonlinear mixup mechanisms that return mixtures of existing labeled samples. In addition to traditional pairwise mixup, we present a convex clustering mixup approach for a data-driven relationship among several simplicial complexes. We theoretically demonstrate that the resultant synthetic simplicial complexes interpolate among existing data with respect to homomorphism densities. Our method is demonstrated on both synthetic and real-world datasets for simplicial complex classification.
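To make the idea of linear pairwise mixup concrete, the following is a minimal sketch under stated assumptions, not the paper's implementation. It assumes two simplicial complexes share an aligned node set and are stored as sets of sorted tuples (edges and filled triangles). The function name `linear_mixup`, this representation, and the sampling scheme are all hypothetical choices made for illustration. Each simplex is kept with a probability given by the convex combination of its membership in the two inputs, and a triangle is kept only if all three of its edges were sampled, so the output remains a valid simplicial complex.

```python
import random
from itertools import combinations


def linear_mixup(complex_a, complex_b, label_a, label_b, lam, rng):
    """Sample a synthetic complex from a convex combination of two complexes.

    Hypothetical representation: each complex is (edges, triangles), where
    edges are sorted 2-tuples and triangles are sorted 3-tuples of node ids.
    """
    edges_a, tris_a = complex_a
    edges_b, tris_b = complex_b

    def mixed_prob(simplex, set_a, set_b):
        # Convex combination of the two 0/1 membership indicators.
        return lam * (simplex in set_a) + (1.0 - lam) * (simplex in set_b)

    # Sample every candidate edge independently from its mixed probability.
    edges = {
        e for e in sorted(edges_a | edges_b)
        if rng.random() < mixed_prob(e, edges_a, edges_b)
    }

    # Keep a triangle only if all of its edges survived (closure property).
    tris = set()
    for t in sorted(tris_a | tris_b):
        faces = set(combinations(t, 2))
        if faces <= edges and rng.random() < mixed_prob(t, tris_a, tris_b):
            tris.add(t)

    # Labels are mixed with the same weight, as in standard mixup.
    label = [lam * a + (1.0 - lam) * b for a, b in zip(label_a, label_b)]
    return (edges, tris), label


# Toy inputs: a filled triangle with a pendant edge, and a hollow 4-cycle.
complex_a = ({(0, 1), (0, 2), (1, 2), (2, 3)}, {(0, 1, 2)})
complex_b = ({(0, 1), (1, 2), (2, 3), (0, 3)}, set())
mixed, mixed_label = linear_mixup(
    complex_a, complex_b, [1.0, 0.0], [0.0, 1.0], 0.5, random.Random(0)
)
```

Setting `lam` to 1 or 0 recovers the first or second input exactly, while intermediate values yield random complexes whose expected edge and triangle densities lie between those of the inputs. The nonlinear and convex clustering variants described in the abstract are not covered by this sketch.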