LG · AI · Oct 7, 2025

Are Heterogeneous Graph Neural Networks Truly Effective? A Causal Perspective

arXiv:2510.05750v14.1 · h-index: 2 · Has Code

Originality: Incremental advance

AI Analysis

This work addresses a fundamental question in graph machine learning about the effectiveness of HGNNs, providing causal insights that could guide future model design and evaluation.

The study asks whether heterogeneous graph neural networks (HGNNs) are truly effective, examining them from two perspectives: model architecture and heterogeneous information. It finds that model architecture has no causal effect on performance, whereas heterogeneous information improves performance by increasing homophily and local-global distribution discrepancy.

Graph neural networks (GNNs) have achieved remarkable success in node classification. Building on this progress, heterogeneous graph neural networks (HGNNs) integrate relation types and node and edge semantics to leverage heterogeneous information. Causal analysis for HGNNs is advancing rapidly, aiming to separate genuine causal effects from spurious correlations. However, whether HGNNs are intrinsically effective remains underexamined, and most studies implicitly assume rather than establish this effectiveness. In this work, we examine HGNNs from two perspectives: model architecture and heterogeneous information. We conduct a systematic reproduction across 21 datasets and 20 baselines, complemented by comprehensive hyperparameter retuning. To further disentangle the source of performance gains, we develop a causal effect estimation framework that constructs and evaluates candidate factors under standard assumptions through factual and counterfactual analyses, with robustness validated via minimal sufficient adjustment sets, cross-method consistency checks, and sensitivity analyses. Our results lead to two conclusions. First, model architecture and complexity have no causal effect on performance. Second, heterogeneous information exerts a positive causal effect by increasing homophily and local-global distribution discrepancy, which makes node classes more distinguishable. The implementation is publicly available at https://github.com/YXNTU/CausalHGNN.
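To make the homophily claim concrete, here is a minimal sketch of edge homophily, the fraction of edges whose two endpoints share a class label. This is a commonly used measure, but the abstract does not say which homophily measure the authors use, so treat the choice as an assumption. The toy graph, the relation names (`cites`, `shares_author`), and the function `edge_homophily` are all hypothetical and are not taken from the CausalHGNN repository.

```python
from typing import Dict, List, Sequence, Tuple

Edge = Tuple[int, int]


def edge_homophily(edges: Sequence[Edge], labels: Sequence[int]) -> float:
    """Return the fraction of edges whose endpoints have the same label."""
    if not edges:
        raise ValueError("edge_homophily is undefined for a graph with no edges")
    same = sum(1 for u, v in edges if labels[u] == labels[v])
    return same / len(edges)


# Hypothetical toy graph: four nodes in two classes, two relation types.
labels: List[int] = [0, 0, 1, 1]
typed_edges: Dict[str, List[Edge]] = {
    "cites": [(0, 1), (2, 3), (1, 2)],      # one cross-class edge
    "shares_author": [(0, 1), (2, 3)],      # only same-class edges
}

# Homophily when relation types are ignored (all edges merged).
merged: List[Edge] = [e for es in typed_edges.values() for e in es]
h_merged = edge_homophily(merged, labels)

# Homophily computed separately for each relation type.
h_by_type = {rel: edge_homophily(es, labels) for rel, es in typed_edges.items()}

print(f"merged: {h_merged:.3f}")
for rel, h in h_by_type.items():
    print(f"{rel}: {h:.3f}")
```

In this toy case one relation type is more homophilous than the merged graph, which illustrates how keeping relation types apart can expose neighborhoods whose labels agree more often. It does not reproduce the paper's causal analysis.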
