LG · AI · ME · Jun 23, 2022

Explanatory causal effects for model agnostic explanations

arXiv:2206.11529v1 · 1 citation · h-index: 49
Originality: Incremental advance
AI Analysis

This work addresses the need for interpretable, model-agnostic explanations in machine learning. It appears incremental, since it builds on existing causal effect concepts.

The paper estimates how much each feature contributes to an individual prediction and to a model's overall behavior. It defines an explanatory causal effect based on a hypothetical ideal experiment, which yields transparent, data-driven explanations without requiring a known causal graph, and it reports experiments on real-world data sets.

This paper studies the problem of estimating the contributions of features to the prediction of a specific instance by a machine learning model and the overall contribution of a feature to the model. The causal effect of a feature (variable) on the predicted outcome reflects the contribution of the feature to a prediction very well. A challenge is that most existing causal effects cannot be estimated from data without a known causal graph. In this paper, we define an explanatory causal effect based on a hypothetical ideal experiment. The definition brings several benefits to model agnostic explanations. First, explanations are transparent and have causal meanings. Second, the explanatory causal effect estimation can be data driven. Third, the causal effects provide both a local explanation for a specific prediction and a global explanation showing the overall importance of a feature in a predictive model. We further propose a method using individual and combined variables based on explanatory causal effects for explanations. We show the definition and the method work with experiments on some real-world data sets.
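The local and global readings of a feature's effect can be illustrated with a minimal sketch. This is not the paper's estimator, whose formal definition is not given in this summary. It swaps in a simple perturbation of model inputs: set one binary feature on and off while holding the others fixed, then compare predictions. The model `predict`, the helpers `local_effect` and `global_effect`, and the toy data are all hypothetical names and values chosen for illustration.

```python
import numpy as np


def predict(X):
    """Hypothetical black-box model (an assumption, not from the paper)."""
    X = np.asarray(X, dtype=float)
    return 0.2 + 0.5 * X[:, 0] - 0.3 * X[:, 1] + 0.1 * X[:, 0] * X[:, 2]


def local_effect(predict_fn, x, j):
    """Change in the prediction for one instance when binary feature j
    is switched from 0 to 1 and every other feature is held fixed."""
    x = np.asarray(x, dtype=float)
    x_on, x_off = x.copy(), x.copy()
    x_on[j], x_off[j] = 1.0, 0.0
    preds = predict_fn(np.stack([x_on, x_off]))
    return float(preds[0] - preds[1])


def global_effect(predict_fn, X, j):
    """Average of the per-instance effects of feature j over a data set."""
    X = np.asarray(X, dtype=float)
    X_on, X_off = X.copy(), X.copy()
    X_on[:, j], X_off[:, j] = 1.0, 0.0
    return float(np.mean(predict_fn(X_on) - predict_fn(X_off)))


# Toy binary data set (illustrative values only).
X = np.array([[0, 0, 0],
              [1, 1, 1],
              [1, 0, 1],
              [0, 1, 0]])
x = X[1]  # the instance to explain

local_scores = [local_effect(predict, x, j) for j in range(X.shape[1])]
global_scores = [global_effect(predict, X, j) for j in range(X.shape[1])]

print("local :", np.round(local_scores, 3))
print("global:", np.round(global_scores, 3))
```

Because feature 0 interacts with feature 2 in this toy model, its local score depends on the instance, while the global score averages that dependence away. The sketch intervenes on the model's inputs directly; estimating effects from observational data, as the abstract describes, requires the paper's own definition.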

