A GAN Approach for Node Embedding in Heterogeneous Graphs Using Subgraph Sampling
This work addresses biased inference caused by class imbalance in heterogeneous graphs, a problem relevant to researchers and practitioners in graph machine learning, and represents an incremental advance over existing methods.
The paper tackles class imbalance in heterogeneous graph neural networks by proposing a GAN-based framework that generates synthetic nodes and edges to enhance classification of underrepresented node classes, achieving notable improvements in F-score and AUC-PRC scores on real-world datasets.
Graph neural networks (GNNs) struggle with class imbalance, which leads to biased inference results. To address this issue in heterogeneous graphs, we propose a novel framework that combines a GNN with a generative adversarial network (GAN) to improve classification of underrepresented node classes. The framework incorporates an edge generation and selection module that allows synthetic nodes and edges to be created simultaneously through adversarial learning. Previous methods predominantly focus on homogeneous graphs because heterogeneous graph structures are difficult to represent in matrix form; our approach is designed specifically for heterogeneous data. Existing solutions also often rely on pre-trained models to incorporate synthetic nodes, which can lead to optimization inconsistencies and mismatches in data representation. Our framework avoids these pitfalls by generating data that aligns closely with the inherent graph topology and attributes, so that the synthetic data integrates more cohesively with the original graph. Evaluations on multiple real-world datasets show that the method outperforms baseline models, particularly on tasks focused on identifying minority node classes, with notable improvements in F-score and AUC-PRC. These findings highlight the potential of this approach for addressing class imbalance in heterogeneous graph learning.
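The abstract reports results in terms of F-score and AUC-PRC, both of which are computed with respect to the minority class. As an illustration only, the following minimal sketch shows how these two metrics can be computed and why plain accuracy is uninformative under imbalance. The toy labels and scores, the function names, and the use of average precision as the AUC-PRC estimate are assumptions made for this example; they are not taken from the paper's evaluation code.

```python
from typing import List

def f1_score(y_true: List[int], y_pred: List[int], positive: int = 1) -> float:
    """F1 for the class `positive` (here: the minority class)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both 0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def average_precision(y_true: List[int], scores: List[float], positive: int = 1) -> float:
    """Step-wise estimate of the area under the precision-recall curve.

    Assumption: tied scores are ranked in input order, which may differ
    from how a library implementation groups ties.
    """
    n_pos = sum(1 for t in y_true if t == positive)
    if n_pos == 0:
        return 0.0
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    tp = 0
    total = 0.0
    for rank, i in enumerate(order, start=1):
        if y_true[i] == positive:
            tp += 1
            total += tp / rank  # precision at each recalled positive
    return total / n_pos

# Hypothetical toy data: 8 majority nodes (label 0), 2 minority nodes (label 1).
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
scores = [0.1, 0.2, 0.3, 0.4, 0.2, 0.1, 0.6, 0.3, 0.9, 0.5]

# A classifier that always predicts the majority class.
majority_pred = [0] * len(y_true)
accuracy = sum(t == p for t, p in zip(y_true, majority_pred)) / len(y_true)
majority_f1 = f1_score(y_true, majority_pred)  # accuracy is 0.8, F1 is 0.0

# A score-based classifier thresholded at 0.5.
threshold_pred = [1 if s >= 0.5 else 0 for s in scores]
threshold_f1 = f1_score(y_true, threshold_pred)
ap = average_precision(y_true, scores)
```

In this toy setting the majority-only predictor reaches 80% accuracy while its minority-class F1 is zero, which is the kind of bias that minority-focused metrics are meant to expose.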