LG | AI | May 21, 2021

Probabilistic Sufficient Explanations

arXiv:2105.10118v1 | 33 citations | Has Code
Originality: Incremental advance
AI Analysis

The paper addresses the need for reliable, interpretable explanations of machine learning predictions, particularly for users of complex models. The advance is incremental because it builds on existing explanation frameworks.

The paper tackles the problem of explaining black-box classifier decisions. It introduces probabilistic sufficient explanations, which select a minimal subset of features that gives strong probabilistic guarantees that the model behaves consistently, and it reports advantages over existing methods such as Anchors and logical explanations.

Understanding the behavior of learned classifiers is an important task, and various black-box explanations, logical reasoning approaches, and model-specific methods have been proposed. In this paper, we introduce probabilistic sufficient explanations, which formulate explaining an instance of classification as choosing the "simplest" subset of features such that only observing those features is "sufficient" to explain the classification, that is, sufficient to give us strong probabilistic guarantees that the model will behave similarly when all features are observed under the data distribution. In addition, we leverage tractable probabilistic reasoning tools such as probabilistic circuits and expected predictions to design a scalable algorithm for finding the desired explanations while keeping the guarantees intact. Our experiments demonstrate the effectiveness of our algorithm in finding sufficient explanations, and showcase its advantages compared to Anchors and logical explanations.
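To make the idea in the abstract concrete, here is a minimal brute-force sketch of searching for a small feature subset whose observed values alone keep the expected prediction high. It is not the paper's algorithm: the abstract describes a scalable method built on probabilistic circuits, whereas this sketch enumerates every completion of the unobserved features. The toy classifier, the independent marginals, the threshold criterion, and all function names are assumptions made for illustration.

```python
from itertools import combinations, product

# Hypothetical toy setup: three binary features with independent marginals
# P(X_i = 1). A real application would use a learned data distribution.
MARGINALS = [0.5, 0.5, 0.5]


def classifier(x):
    """Toy classifier (an assumption): probability of the positive class."""
    return 0.9 if (x[0] == 1 and x[1] == 1) else 0.1


def prob(x):
    """Probability of a full assignment under the independent marginals."""
    p = 1.0
    for xi, m in zip(x, MARGINALS):
        p *= m if xi == 1 else 1.0 - m
    return p


def expected_prediction(instance, subset):
    """E[f(X) | X_S = instance_S], computed by enumerating all completions."""
    num = 0.0
    den = 0.0
    for x in product([0, 1], repeat=len(instance)):
        # Skip assignments that disagree with the observed features.
        if any(x[i] != instance[i] for i in subset):
            continue
        w = prob(x)
        num += w * classifier(x)
        den += w
    return num / den


def smallest_sufficient_subset(instance, threshold):
    """Return the first smallest subset whose expected prediction reaches
    the threshold. Assumes the instance is classified as positive."""
    n = len(instance)
    for size in range(n + 1):
        for subset in combinations(range(n), size):
            if expected_prediction(instance, subset) >= threshold:
                return subset
    return None


instance = (1, 1, 0)
explanation = smallest_sufficient_subset(instance, threshold=0.8)
print(explanation)
```

In this toy case, observing only the first two features is enough to keep the expected prediction at the level of the full instance, while the third feature is irrelevant. The enumeration is exponential in the number of features, which is the cost that tractable models such as probabilistic circuits are meant to avoid.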

Code Implementations: 1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
