LG, AI, ML | Jun 23, 2021

Synthetic Benchmarks for Scientific Research in Explainable Machine Learning

arXiv:2106.12543v4 | 88 citations | Has Code
Originality: Synthesis-oriented
AI Analysis

XAI-Bench gives explainable-AI researchers a tool for comparing feature attribution methods more easily, though the contribution is incremental because it builds on existing evaluation needs.

The authors tackled the challenge of evaluating feature attribution methods in explainable machine learning by releasing XAI-Bench, a suite of synthetic datasets and a library for benchmarking, which enables efficient computation of ground-truth metrics like Shapley values.

As machine learning models grow more complex and their applications become more high-stakes, tools for explaining model predictions have become increasingly important. This has spurred a flurry of research in model explainability and has given rise to feature attribution methods such as LIME and SHAP. Despite their widespread use, evaluating and comparing different feature attribution methods remains challenging: evaluations ideally require human studies, and empirical evaluation metrics are often data-intensive or computationally prohibitive on real-world datasets. In this work, we address this issue by releasing XAI-Bench: a suite of synthetic datasets along with a library for benchmarking feature attribution algorithms. Unlike real-world datasets, synthetic datasets allow the efficient computation of conditional expected values that are needed to evaluate ground-truth Shapley values and other metrics. The synthetic datasets we release offer a wide variety of parameters that can be configured to simulate real-world data. We demonstrate the power of our library by benchmarking popular explainability techniques across several evaluation metrics and across a variety of settings. The versatility and efficiency of our library will help researchers bring their explainability methods from development to deployment. Our code is available at https://github.com/abacusai/xai-bench.
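The claim that synthetic data makes ground-truth Shapley values tractable can be illustrated with a minimal sketch. It assumes, purely for illustration, that features follow a multivariate Gaussian with known mean and covariance and that the model is linear, so the conditional expectation E[f(X) | X_S = x_S] has a closed form. The function names (`conditional_mean`, `value`, `exact_shapley`) are hypothetical and are not taken from the xai-bench library.

```python
from itertools import combinations
from math import factorial

import numpy as np


def conditional_mean(x, mu, sigma, known):
    """E[X | X_known = x_known] for X ~ N(mu, sigma), as a full-length vector."""
    known = list(known)
    unknown = [j for j in range(len(mu)) if j not in known]
    out = np.array(x, dtype=float)
    if not unknown:
        return out                      # everything observed
    if not known:
        return np.array(mu, dtype=float)  # nothing observed: marginal mean
    s_uk = sigma[np.ix_(unknown, known)]
    s_kk = sigma[np.ix_(known, known)]
    # Standard Gaussian conditioning formula.
    out[unknown] = mu[unknown] + s_uk @ np.linalg.solve(s_kk, x[known] - mu[known])
    return out


def value(x, mu, sigma, w, b, known):
    """v(S) = E[f(X) | X_S = x_S] for the assumed linear model f(x) = w.x + b."""
    return float(w @ conditional_mean(x, mu, sigma, known) + b)


def exact_shapley(x, mu, sigma, w, b):
    """Exact Shapley values by enumerating all feature subsets (small d only)."""
    d = len(mu)
    phi = np.zeros(d)
    for i in range(d):
        others = [j for j in range(d) if j != i]
        for size in range(d):
            weight = factorial(size) * factorial(d - size - 1) / factorial(d)
            for subset in combinations(others, size):
                gain = (value(x, mu, sigma, w, b, subset + (i,))
                        - value(x, mu, sigma, w, b, subset))
                phi[i] += weight * gain
    return phi


# Illustrative parameters (assumed, not from the paper).
mu = np.zeros(3)
sigma = np.array([[1.0, 0.5, 0.0],
                  [0.5, 1.0, 0.0],
                  [0.0, 0.0, 1.0]])
w = np.array([1.0, 2.0, -1.0])
x = np.array([1.0, 0.0, 2.0])

phi = exact_shapley(x, mu, sigma, w, b=0.0)
print(phi, phi.sum())
```

The attributions sum to f(x) minus the mean prediction, which is a quick sanity check on any ground-truth computation. Subset enumeration costs O(2^d) value evaluations per feature, so the sketch only suits small feature counts. With real-world data the conditional expectation has no closed form and must be estimated, which is the cost the abstract refers to.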

Code Implementations: 1 repo