LG | AI | Feb 5, 2021

CF-GNNExplainer: Counterfactual Explanations for Graph Neural Networks

arXiv:2102.03322v4 | 190 citations
AI Analysis

This work addresses the problem of providing counterfactual explanations for GNN predictions, which practitioners need in order to understand model behavior and to trust GNN applications.

This paper introduces CF-GNNExplainer, a method for generating counterfactual explanations for Graph Neural Networks (GNNs) by identifying the minimal perturbation to the input graph that changes the prediction. Using only edge deletions, the method generates counterfactual explanations for most instances across three datasets while removing fewer than 3 edges on average, with at least 94% accuracy.

Given the increasing promise of graph neural networks (GNNs) in real-world applications, several methods have been developed for explaining their predictions. Existing methods for interpreting predictions from GNNs have primarily focused on generating subgraphs that are especially relevant for a particular prediction. However, such methods are not counterfactual (CF) in nature: given a prediction, we want to understand how the prediction can be changed in order to achieve an alternative outcome. In this work, we propose a method for generating CF explanations for GNNs: the minimal perturbation to the input (graph) data such that the prediction changes. Using only edge deletions, we find that our method, CF-GNNExplainer, can generate CF explanations for the majority of instances across three widely used datasets for GNN explanations, while removing less than 3 edges on average, with at least 94% accuracy. This indicates that CF-GNNExplainer primarily removes edges that are crucial for the original predictions, resulting in minimal CF explanations.
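
To make the idea of a minimal edge-deletion counterfactual concrete, here is a small sketch on a toy graph. It is not the CF-GNNExplainer algorithm: it swaps in an exhaustive search over edge subsets, which is only feasible for tiny graphs, and it uses a hand-built one-layer mean-aggregation classifier as a stand-in for a trained GNN. The names predict and minimal_edge_deletion_cf, the graph, the features, and the weights are all illustrative assumptions.

```python
from itertools import combinations

import numpy as np


def predict(adj, feats, weights, node):
    """Toy one-layer, mean-aggregation classifier (assumed stand-in for a trained GNN)."""
    a_hat = adj + np.eye(adj.shape[0])         # add self-loops
    deg = a_hat.sum(axis=1, keepdims=True)     # degree including the self-loop
    logits = (a_hat / deg) @ feats @ weights   # average neighbour features, then linear map
    return int(np.argmax(logits[node]))


def minimal_edge_deletion_cf(adj, feats, weights, node):
    """Exhaustively find a smallest set of edges whose deletion changes the prediction.

    Illustrative brute force only; returns (deleted_edges, original_label, new_label),
    or None if no set of deletions changes the prediction.
    """
    original = predict(adj, feats, weights, node)
    n = adj.shape[0]
    edges = [(i, j) for i in range(n) for j in range(i + 1, n) if adj[i, j]]
    for k in range(1, len(edges) + 1):         # try smaller deletion sets first
        for subset in combinations(edges, k):
            perturbed = adj.copy()
            for i, j in subset:
                perturbed[i, j] = perturbed[j, i] = 0.0   # delete an undirected edge
            new = predict(perturbed, feats, weights, node)
            if new != original:
                return list(subset), original, new
    return None


# Hypothetical 4-node graph with undirected edges 0-1, 0-2, 0-3, 1-2.
adj = np.zeros((4, 4))
for i, j in [(0, 1), (0, 2), (0, 3), (1, 2)]:
    adj[i, j] = adj[j, i] = 1.0

feats = np.array([[1.0, 0.0], [0.0, 2.0], [0.0, 2.0], [2.0, 0.0]])
weights = np.eye(2)                            # identity map: logits are the averaged features

deleted, original, counterfactual = minimal_edge_deletion_cf(adj, feats, weights, node=0)
print(deleted, original, counterfactual)
```

Trying deletion sets in order of increasing size guarantees that the first set found is a smallest one, which mirrors the "minimal perturbation" criterion in the abstract. The exhaustive search grows exponentially with the number of edges, so it serves only to illustrate the objective on toy inputs.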

Code Implementations: 1 repo
