GOAt: Explaining Graph Neural Networks via Graph Output Attribution
The work addresses the interpretability of GNNs for practitioners in fields such as network analysis, though its contribution is incremental because it builds on existing explanation methods.
The paper tackles the problem of explaining Graph Neural Networks (GNNs) by introducing Graph Output Attribution (GOAt), a method that attributes graph outputs to input graph features. The resulting explanations are faithful, discriminative, and stable, and experiments on synthetic and real-world data show that GOAt outperforms state-of-the-art GNN explainers by what the authors describe as a remarkable margin.
Understanding the decision-making process of Graph Neural Networks (GNNs) is crucial to their interpretability. Most existing methods for explaining GNNs rely on training auxiliary models, so the explanations themselves remain black-boxed. This paper introduces Graph Output Attribution (GOAt), a novel method to attribute graph outputs to input graph features, creating GNN explanations that are faithful, discriminative, and stable across similar samples. By expanding the GNN as a sum of scalar products involving node features, edge features, and activation patterns, we propose an efficient analytical method to compute the contribution of each node or edge feature to each scalar product, and we aggregate the contributions from all scalar products in the expansion to derive the importance of each node and edge. Through extensive experiments on synthetic and real-world data, we show that our method not only outperforms various state-of-the-art GNN explainers in terms of the commonly used fidelity metric, but also exhibits stronger discriminability and stability by a remarkable margin.
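The expansion idea can be illustrated with a toy sketch. It is not the authors' implementation. It assumes a single bias-free linear message-passing layer with sum readout, y = sum over i, j, k of A[i, j] * X[j, k] * w[k], so no activation patterns appear. It also assumes that each scalar product is split equally between its two input factors (the edge entry and the node feature), with the learned weight w[k] treated as a constant. The function name `attribute_linear_gnn` and the example values are hypothetical.

```python
import numpy as np

def attribute_linear_gnn(A, X, w):
    """Toy attribution for y = sum_{i,j,k} A[i,j] * X[j,k] * w[k].

    Assumption (illustrative only): every scalar product is shared
    equally between the edge entry A[i,j] and the node feature X[j,k].
    """
    # terms[i, j, k] holds one scalar product of the expanded output.
    terms = np.einsum("ij,jk,k->ijk", A, X, w)
    output = terms.sum()
    # Half of each term goes to the edge (i, j) it passes through.
    edge_attr = 0.5 * terms.sum(axis=2)        # shape (n, n)
    # The other half goes to the source node j whose feature is used.
    node_attr = 0.5 * terms.sum(axis=(0, 2))   # shape (n,)
    return output, edge_attr, node_attr

# Hypothetical 3-node path graph with self-loops and 2 features per node.
A = np.array([[1.0, 1.0, 0.0],
              [1.0, 1.0, 1.0],
              [0.0, 1.0, 1.0]])
X = np.array([[1.0, 0.0],
              [0.5, 2.0],
              [0.0, 1.0]])
w = np.array([1.0, -0.5])

output, edge_attr, node_attr = attribute_linear_gnn(A, X, w)
print("output:", output)
print("edge attributions:\n", edge_attr)
print("node attributions:", node_attr)
```

Under these assumptions the edge and node attributions sum exactly to the model output, and absent edges receive zero attribution. A real GNN with nonlinear activations and multiple layers would have more factors in each product, including the activation patterns mentioned above.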