LG · ML · Jun 16, 2020

High Dimensional Model Explanations: an Axiomatic Approach

arXiv:2006.08969v2 · 23 citations
AI Analysis

This work addresses the need for interpretability in critical decision-making domains by providing a formal, axiomatic approach to high-dimensional model explanations. The contribution is incremental: it builds on existing explanation methods and extends them to handle feature interactions.

The paper explains black-box machine learning models by capturing the joint effects of feature subsets. It proposes a high-dimensional explanation method based on a generalized Banzhaf index, which uniquely satisfies a set of natural axioms and is the optimal local approximation of the model.

Complex black-box machine learning models are regularly used in critical decision-making domains. This has given rise to several calls for algorithmic explainability. Many explanation algorithms proposed in the literature assign importance to each feature individually. However, such explanations fail to capture the joint effects of sets of features. Indeed, few works so far formally analyze high-dimensional model explanations. In this paper, we propose a novel high-dimensional model explanation method that captures the joint effect of feature subsets. We propose a new axiomatization for a generalization of the Banzhaf index; our method can also be thought of as an approximation of a black-box model by a higher-order polynomial. In other words, this work justifies the use of the generalized Banzhaf index as a model explanation by showing that it uniquely satisfies a set of natural desiderata and that it is the optimal local approximation of a black-box model. Our empirical evaluation of our measure highlights how it manages to capture desirable behavior, whereas other measures that do not satisfy our axioms behave in an unpredictable manner.
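To make the idea of scoring feature subsets concrete, here is a minimal sketch of the standard Banzhaf interaction index from cooperative game theory, computed by brute force. It is an illustration only, not the paper's implementation: it assumes the paper's generalization agrees with this standard definition. The names `toy_model`, `point`, `baseline`, and `v` are hypothetical, and replacing inactive features with a baseline value is likewise an assumption about how a model is turned into a set function.

```python
from itertools import combinations


def subsets(items):
    """Yield every subset of `items` as a frozenset."""
    items = list(items)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)


def banzhaf_interaction(v, n, S):
    """Standard Banzhaf interaction index of feature set S.

    v maps a frozenset of feature indices to a real number and
    n is the total number of features. Cost is exponential in n.
    """
    S = frozenset(S)
    rest = [i for i in range(n) if i not in S]
    total = 0.0
    for T in subsets(rest):
        # Discrete derivative of v with respect to S, evaluated at T.
        for L in subsets(S):
            sign = (-1) ** (len(S) - len(L))
            total += sign * v(T | L)
    # Average over all 2^(n - |S|) coalitions of the remaining features.
    return total / 2 ** len(rest)


# Hypothetical toy model: one additive term and one pairwise interaction.
def toy_model(x):
    return 2.0 * x[0] + 3.0 * x[1] * x[2]


point = (1.0, 1.0, 1.0)     # input being explained (assumed)
baseline = (0.0, 0.0, 0.0)  # reference input (assumed)


def v(active):
    """Evaluate the model with active features taken from `point`."""
    x = [point[i] if i in active else baseline[i] for i in range(3)]
    return toy_model(x)


singles = {i: banzhaf_interaction(v, 3, {i}) for i in range(3)}
pair_12 = banzhaf_interaction(v, 3, {1, 2})
pair_01 = banzhaf_interaction(v, 3, {0, 1})
print(singles, pair_12, pair_01)
```

In this toy case the pair {1, 2} receives a score of 3.0, matching the coefficient of the product term, while the pair {0, 1} receives 0.0 because those two features do not interact. The single-feature scores for features 1 and 2 (1.5 each) cannot reveal this on their own, which is the gap that subset-level explanations are meant to close.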

