ML, LG
Jan 26, 2018

Considerations When Learning Additive Explanations for Black-Box Models

arXiv:1801.08640v4
76 citations
Originality: Incremental advance
AI Analysis

This work addresses how to choose a trustworthy explanation for a black-box model, a practical concern for machine learning practitioners. Its contribution is incremental, since it mainly compares existing methods.

The paper studies global additive explanations for non-additive black-box models. Distilled additive explanations were generally the most accurate of the additive methods, and non-additive explanations such as tree explanations tended to be more accurate still. In a user study, however, machine learning practitioners made better use of additive explanations across tasks.

Many methods to explain black-box models, whether local or global, are additive. In this paper, we study global additive explanations for non-additive models, focusing on four explanation methods: partial dependence, Shapley explanations adapted to a global setting, distilled additive explanations, and gradient-based explanations. We show that different explanation methods characterize non-additive components in a black-box model's prediction function in different ways. We use the concepts of main and total effects to anchor additive explanations, and quantitatively evaluate additive and non-additive explanations. Even though distilled explanations are generally the most accurate additive explanations, non-additive explanations such as tree explanations that explicitly model non-additive components tend to be even more accurate. Despite this, our user study showed that machine learning practitioners were better able to leverage additive explanations for various tasks. These considerations should be taken into account when considering which explanation to trust and use to explain black-box models.
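To make the idea of an additive explanation of a non-additive model concrete, here is a minimal sketch of one of the four methods named above, partial dependence. It is not the paper's code: the `black_box` function, the sample size, and the grid are hypothetical choices made only for illustration. The sketch computes one partial dependence curve per feature, sums the curves into an additive approximation, and measures how much of the prediction the additive form misses because of the interaction term.

```python
import numpy as np

def black_box(X):
    # Hypothetical non-additive model (an assumption for this sketch):
    # two main effects plus an x0 * x1 interaction.
    return X[:, 0] + 2.0 * X[:, 1] + X[:, 0] * X[:, 1]

def partial_dependence(f, X, j, grid):
    """Average prediction of f when feature j is fixed to each grid value."""
    pd_vals = np.empty(len(grid))
    for k, v in enumerate(grid):
        Xv = X.copy()
        Xv[:, j] = v          # overwrite feature j for every row
        pd_vals[k] = f(Xv).mean()
    return pd_vals

rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(2000, 2))
grid = np.linspace(-1.0, 1.0, 5)

pd0 = partial_dependence(black_box, X, 0, grid)
pd1 = partial_dependence(black_box, X, 1, grid)

# Additive approximation: sum of the per-feature curves, minus one copy of
# the mean prediction so the intercept is not counted twice.
mean_pred = black_box(X).mean()
additive = (
    np.interp(X[:, 0], grid, pd0)
    + np.interp(X[:, 1], grid, pd1)
    - mean_pred
)

# The gap between the black box and its additive explanation comes from
# the interaction term, which no sum of one-feature curves can represent.
rmse = np.sqrt(np.mean((black_box(X) - additive) ** 2))
print("RMSE of additive explanation:", rmse)
```

Because the interaction `x0 * x1` averages out to roughly zero under this sampling, the partial dependence curves here are close to the pure main effects, and the remaining error is about the size of the interaction itself. Other additive methods would distribute that non-additive component differently, which is the behavior the abstract describes.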

Code Implementations: 1 repo