Learn from Heterophily: Heterophilous Information-enhanced Graph Neural Network
The work addresses a key limitation of graph learning on heterophilous data, though the contribution appears incremental: it builds on existing GNN methods by adding heterophilous information.
The paper tackles the poor performance of Graph Neural Networks (GNNs) under heterophily, where nodes with different labels are connected. It proposes HiGNN, which constructs a new graph structure from heterophilous semantic information to strengthen connectivity between similar nodes, and reports improved node classification performance across benchmark datasets.
Under heterophily, where nodes with different labels tend to be connected based on semantic meanings, Graph Neural Networks (GNNs) often exhibit suboptimal performance. Current studies on graph heterophily mainly focus on aggregation calibration or neighbor extension, and address the heterophily issue by using node features or structural information to improve GNN representations. In this paper, we propose and demonstrate that the valuable semantic information inherent in heterophily can be used effectively in graph learning by investigating the distribution of neighbors of each individual node in the graph. A theoretical analysis is carried out to demonstrate the efficacy of this idea in enhancing graph learning. Based on this analysis, we propose HiGNN, an approach that constructs an additional graph structure integrating heterophilous information: it leverages node distribution to enhance connectivity between nodes that share similar semantic characteristics. We conduct empirical assessments on node classification tasks using both homophilous and heterophilous benchmark datasets and compare HiGNN to popular GNN baselines and SoTA methods, confirming its effectiveness in improving graph representations. In addition, we show that incorporating heterophilous information yields a notable improvement both in existing GNN-based approaches and in the homophily degree across real-world datasets, affirming the efficacy of our approach.
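The idea of linking nodes whose neighbor distributions look alike can be illustrated with a minimal sketch. This is not the HiGNN implementation: the function names, the use of cosine similarity, the top-k linking rule, and the toy graph are all assumptions made for illustration. It also uses ground-truth labels for every node, which a real setting would not have available. The sketch measures edge homophily (the fraction of edges joining same-label nodes) before and after building the additional graph.

```python
import numpy as np

def edge_homophily(edges, labels):
    """Fraction of edges whose two endpoints share a label."""
    edges = np.asarray(edges)
    return float(np.mean(labels[edges[:, 0]] == labels[edges[:, 1]]))

def neighbor_label_distribution(edges, labels, num_nodes, num_classes):
    """Row i is the normalized histogram of labels among node i's neighbors."""
    dist = np.zeros((num_nodes, num_classes))
    for u, v in edges:
        dist[u, labels[v]] += 1
        dist[v, labels[u]] += 1
    row_sums = dist.sum(axis=1, keepdims=True)
    return dist / np.maximum(row_sums, 1)  # isolated nodes keep an all-zero row

def similarity_graph(dist, k):
    """Link each node to its k most similar nodes (cosine similarity of rows).

    Assumption: top-k cosine similarity is a stand-in for whatever rule
    the paper actually uses to connect semantically similar nodes.
    """
    norms = np.linalg.norm(dist, axis=1, keepdims=True)
    unit = dist / np.maximum(norms, 1e-12)
    sim = unit @ unit.T
    np.fill_diagonal(sim, -np.inf)  # never link a node to itself
    top_k = np.argsort(-sim, axis=1)[:, :k]
    new_edges = {
        (min(i, int(j)), max(i, int(j)))
        for i in range(len(dist))
        for j in top_k[i]
    }
    return sorted(new_edges)

# Toy, fully heterophilous graph: every class-0 node links to every class-1 node.
labels = np.array([0, 0, 0, 1, 1, 1])
edges = [(u, v) for u in range(3) for v in range(3, 6)]

dist = neighbor_label_distribution(edges, labels, num_nodes=6, num_classes=2)
new_edges = similarity_graph(dist, k=2)

print("original homophily:", edge_homophily(edges, labels))
print("new-graph homophily:", edge_homophily(new_edges, labels))
```

In this toy case the original graph has no same-label edges, yet nodes of the same class have identical neighbor distributions, so the constructed graph connects only same-label nodes. Real datasets are far noisier, and the gain would depend on how well neighbor distributions separate the classes.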