LG · Dec 16, 2022

TopoImb: Toward Topology-level Imbalance in Learning from Graphs

Tianxiang Zhao, Dongsheng Luo, Xiang Zhang, Suhang Wang

arXiv:2212.08689v114.115 citations · h-index: 48

Originality: Incremental advance

AI Analysis

This paper addresses a specific data imbalance issue in graph neural networks that is relevant to researchers and practitioners. It is an incremental advance, since it builds on existing class-level imbalance methods.

The paper tackles topology-level imbalance in graph learning, where a few majority topology groups dominate training. It proposes TopoImb, a framework that automatically identifies topology groups and modulates training so that no group is under-represented. The authors report empirical gains on both node-level and graph-level classification tasks.

Graphs are a powerful tool for modeling data with an underlying structure in non-Euclidean space, encoding entities as nodes and relations as edges. Despite years of progress in learning from graph-structured data, one obstacle persists: graph imbalance. Several attempts have been made to address this problem, but they consider only class-level imbalance. In this work, we argue that for graphs, imbalance is also likely to exist at the level of sub-class topology groups. Because topology structures are flexible, graphs can be highly diverse, which makes a generalizable classification boundary difficult to learn. As a result, a few majority topology groups may dominate the learning process and leave the others under-represented. To address this problem, we propose a new framework, TopoImb, and design (1) a topology extractor, which automatically identifies the topology group of each instance using explicit memory cells, and (2) a training modulator, which modulates the learning process of the target GNN model to prevent topology-group-wise under-representation. TopoImb can be used as a key component in GNN models to improve their performance under data imbalance. We provide theoretical analyses of both topology-level imbalance and the proposed TopoImb, and we empirically verify its effectiveness on node-level and graph-level classification tasks.
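The two components named in the abstract can be illustrated with a rough sketch. This is not the authors' implementation: the abstract does not specify how memory cells are matched or how training is modulated, so the sketch assumes (a) each instance is assigned to the memory cell with the highest cosine similarity to its embedding, and (b) instances in groups with a higher mean loss receive a larger training weight through a softmax over group losses. The function names `assign_groups` and `group_weights`, the `temperature` parameter, and the toy data are all hypothetical.

```python
import math


def assign_groups(embeddings, memory_cells):
    """Assign each instance to its most similar memory cell.

    Assumption: cosine similarity to a fixed set of prototype vectors
    stands in for the paper's topology extractor.
    """
    def cosine(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
        return dot / norm if norm > 0 else 0.0

    return [
        max(range(len(memory_cells)), key=lambda k: cosine(e, memory_cells[k]))
        for e in embeddings
    ]


def group_weights(losses, groups, temperature=1.0):
    """Return per-instance weights that favor poorly fit groups.

    Assumption: a softmax over mean group losses stands in for the
    paper's training modulator. Weights are rescaled to have mean 1.
    """
    totals, counts = {}, {}
    for loss, g in zip(losses, groups):
        totals[g] = totals.get(g, 0.0) + loss
        counts[g] = counts.get(g, 0) + 1
    mean_loss = {g: totals[g] / counts[g] for g in totals}

    # Subtract the largest group loss before exponentiating for stability.
    top = max(mean_loss.values())
    scores = {g: math.exp((l - top) / temperature) for g, l in mean_loss.items()}

    raw = [scores[g] for g in groups]
    mean_raw = sum(raw) / len(raw)
    return [r / mean_raw for r in raw]


# Toy data: two instances resemble cell 0, one resembles cell 1.
embeddings = [[1.0, 0.0], [0.9, 0.1], [0.1, 1.0]]
memory_cells = [[1.0, 0.0], [0.0, 1.0]]
losses = [0.2, 0.3, 1.5]  # the minority-group instance is fit worst

groups = assign_groups(embeddings, memory_cells)
weights = group_weights(losses, groups)
print(groups, weights)
```

In a training loop, such weights would multiply the per-instance loss before averaging, so that a small, poorly fit topology group contributes more to the gradient than its size alone would give it.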
