CR LG · Jun 5, 2024

Graph Neural Network Explanations are Fragile

Jiate Li, Meng Pang, Yun Dong, Jinyuan Jia, Binghui Wang

arXiv:2406.03193v117.722 citations · Has Code

Originality Highly original

AI Analysis

This work highlights a critical security flaw in explainable AI for graph-based systems, potentially undermining trust in GNN applications, and is incremental as it builds on existing GNN explainer research.

The paper investigates the vulnerability of Graph Neural Network (GNN) explainers to adversarial attacks, finding that slight perturbations to graph structure can drastically alter explanations while maintaining correct model predictions, with methods achieving high attack success rates across various explainers.

Explainable Graph Neural Networks (GNNs) have emerged recently to foster trust in the use of GNNs. Existing GNN explainers are developed from various perspectives to enhance explanation performance. We take the first step toward studying GNN explainers under adversarial attack: we find that an adversary who slightly perturbs the graph structure can ensure that the GNN model still makes correct predictions while the GNN explainer yields a drastically different explanation on the perturbed graph. Specifically, we first formulate the attack problem under a practical threat model (i.e., the adversary has limited knowledge about the GNN explainer and a restricted perturbation budget). We then design two methods (one loss-based and the other deduction-based) to realize the attack. We evaluate our attacks on various GNN explainers, and the results show that these explainers are fragile.
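
The three conditions the abstract describes (a small structural perturbation, an unchanged prediction, and a drastically different explanation) can be illustrated with a small checker. This is a minimal sketch, not the paper's method: the function names, the undirected edge-list representation, the top-edge overlap measure, and the `max_overlap` threshold are all assumptions made for illustration, and the sketch uses "same prediction as before" as a stand-in for "correct prediction".

```python
from typing import Iterable, Set, Tuple

Edge = Tuple[int, int]


def normalize(edges: Iterable[Edge]) -> Set[Edge]:
    """Treat edges as undirected by storing each one as a sorted pair."""
    return {(min(u, v), max(u, v)) for u, v in edges}


def perturbation_size(original: Iterable[Edge], perturbed: Iterable[Edge]) -> int:
    """Count edges added or removed (size of the symmetric difference)."""
    return len(normalize(original) ^ normalize(perturbed))


def explanation_overlap(expl_a: Iterable[Edge], expl_b: Iterable[Edge]) -> float:
    """Fraction of explanation edges in expl_a that also appear in expl_b."""
    a, b = normalize(expl_a), normalize(expl_b)
    return len(a & b) / len(a) if a else 1.0


def attack_succeeds(orig_edges, pert_edges, orig_pred, pert_pred,
                    orig_expl, pert_expl, budget, max_overlap=0.5) -> bool:
    """Hypothetical success test: within budget, same prediction,
    and explanation overlap at or below an assumed threshold."""
    within_budget = perturbation_size(orig_edges, pert_edges) <= budget
    same_prediction = orig_pred == pert_pred
    explanation_changed = explanation_overlap(orig_expl, pert_expl) <= max_overlap
    return within_budget and same_prediction and explanation_changed


# Toy data (made up): one edge removed, one edge added.
orig_edges = [(0, 1), (1, 2), (2, 3)]
pert_edges = [(0, 1), (1, 2), (3, 4)]
orig_expl = [(0, 1), (1, 2)]   # edges an explainer marked as important
pert_expl = [(1, 2), (3, 4)]   # explanation on the perturbed graph
result = attack_succeeds(orig_edges, pert_edges, 1, 1,
                         orig_expl, pert_expl, budget=2)
```

In this toy case two edges differ between the graphs, the predicted label is unchanged, and only half of the original explanation edges survive, so the check passes under the assumed threshold.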
