Efficient computation of contrastive explanations
This work addresses the need for transparent and intuitive explanations in deployed ML systems, particularly for lay people. The contribution is incremental, however, since it focuses on specific model types rather than black-box technologies.
The paper tackles the problem of generating contrastive explanations for machine learning models. It proposes a mathematical formalization and a 2-phase algorithm for efficiently computing plausible pertinent positives, which keeps the computation feasible while allowing guarantees on uniqueness and optimality of the explanation.
With the increasing deployment of machine learning systems in practice, transparency and explainability have become serious issues. Contrastive explanations are considered useful and intuitive, in particular when it comes to explaining decisions to lay people, since they mimic the way in which humans explain. Yet, so far, comparatively little research has addressed computationally feasible technologies that allow guarantees on uniqueness and optimality of the explanation and that enable an easy incorporation of additional constraints. Here, we focus on specific types of models rather than black-box technologies. We study the relation between contrastive and counterfactual explanations and propose mathematical formalizations as well as a 2-phase algorithm for efficiently computing (plausible) pertinent positives of many standard machine learning models.
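To make the notion of a pertinent positive concrete, the following is a minimal sketch and not the 2-phase algorithm itself, whose details are not given in this abstract. It assumes a linear binary classifier sign(w·x + b), treats a feature value of zero as "absent", and greedily keeps the fewest features of x that still yield the original prediction. The function name `pertinent_positive_linear` and the zero baseline are assumptions made for illustration only.

```python
import numpy as np

def pertinent_positive_linear(w, b, x):
    """Hypothetical helper: smallest set of features of x that, with all
    other features set to 0, keeps the prediction of sign(w @ x + b)."""
    w = np.asarray(w, dtype=float)
    x = np.asarray(x, dtype=float)

    # Predicted class of the original input, encoded as +1 / -1.
    y = 1.0 if w @ x + b >= 0 else -1.0

    # Contribution of each feature towards the predicted class.
    contrib = y * w * x
    order = np.argsort(-contrib)  # largest supporting contribution first

    x_pp = np.zeros_like(x)       # assumed baseline: all features absent
    margin = y * b                # margin of the baseline input
    kept = []
    for i in order:
        if margin > 0:            # prediction already reproduced
            break
        if contrib[i] <= 0:       # remaining features do not help
            break
        x_pp[i] = x[i]
        margin += contrib[i]
        kept.append(int(i))

    # The flag reports whether the reduced input is strictly on the
    # same side of the decision boundary as the original input.
    return x_pp, kept, bool(margin > 0)

# Usage with made-up numbers: only feature 0 is needed.
x_pp, kept, found = pertinent_positive_linear([2.0, -1.0, 0.5], -1.0, [1.0, 1.0, 1.0])
print(x_pp, kept, found)
```

For a linear model with a zero baseline, taking the largest supporting contributions first gives a minimum-cardinality set; for other model classes or with plausibility constraints, an optimization-based formulation would be needed instead.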