LG · AI · Jun 28, 2022

BAGEL: A Benchmark for Assessing Graph Neural Network Explanations

arXiv:2206.13983v1 · 18 citations · h-index: 27 · Has Code
Originality: Synthesis-oriented
AI Analysis

BAGEL provides a standardized evaluation framework for researchers and practitioners working on GNN interpretability, though the contribution is incremental: it consolidates existing metrics into a benchmark.

The authors tackled the lack of a standardized benchmark for evaluating interpretability approaches in graph neural networks (GNNs) by proposing BAGEL, a benchmark that includes four evaluation regimes and covers diverse graph datasets, resulting in an extensive empirical study on four GNN models and nine explanation methods.

The problem of interpreting the decisions of machine learning models is well-researched and important. We are interested in a specific type of machine learning model that deals with graph data, called graph neural networks. Evaluating interpretability approaches for graph neural networks (GNNs) specifically is known to be challenging due to the lack of a commonly accepted benchmark. Given a GNN model, several interpretability approaches exist to explain it, with diverse (sometimes conflicting) evaluation methodologies. In this paper, we propose a benchmark for evaluating the explainability approaches for GNNs called Bagel. In Bagel, we first propose four diverse GNN explanation evaluation regimes -- 1) faithfulness, 2) sparsity, 3) correctness, and 4) plausibility. We reconcile multiple evaluation metrics in the existing literature and cover diverse notions for a holistic evaluation. Our graph datasets range from citation networks and document graphs to graphs from molecules and proteins. We conduct an extensive empirical study on four GNN models and nine post-hoc explanation approaches for node and graph classification tasks. We make both the benchmarks and reference implementations available at https://github.com/Mandeep-Rathee/Bagel-benchmark.
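To make one of the four regimes concrete, the following is a minimal sketch of how a sparsity-style score could be computed for an explanation mask. It assumes, as an illustration only, that sparsity is measured as the Shannon entropy of the normalized importance weights (lower entropy meaning a more concentrated, sparser explanation). The abstract does not define the metric, and the function name `mask_entropy` and the example masks are hypothetical, not taken from the Bagel code base.

```python
import math


def mask_entropy(mask):
    """Shannon entropy (in nats) of a non-negative importance mask.

    Assumption for illustration: the mask is normalized to sum to 1 and
    treated as a probability distribution over features, nodes, or edges.
    """
    if any(w < 0 for w in mask):
        raise ValueError("mask weights must be non-negative")
    total = sum(mask)
    if total <= 0:
        raise ValueError("mask must contain at least one positive weight")
    probs = [w / total for w in mask]
    # Zero-probability entries contribute nothing to the entropy.
    return -sum(p * math.log(p) for p in probs if p > 0)


# Hypothetical masks over four input features.
uniform = mask_entropy([1.0, 1.0, 1.0, 1.0])  # importance spread evenly
peaked = mask_entropy([1.0, 0.0, 0.0, 0.0])   # importance on a single feature
```

Under this assumption, the evenly spread mask attains the maximum possible entropy for four entries, while the mask concentrated on one feature scores zero, so a lower score corresponds to a sparser explanation.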

Code Implementations: 1 repo