A Model Explanation System: Latest Updates and Extensions
This work offers incremental improvements to an existing method for explaining individual black-box classifier predictions, motivated by domains like fraud detection.
The authors extended an existing model explanation system to provide explanations for individual predictions from black-box classifiers, demonstrating it on fraud detection in credit card transactions.
We propose a general model explanation system (MES) for "explaining" the output of black-box classifiers. This paper describes extensions to Turner (2015), which is referred to frequently in the text. We use the motivating example of a classifier trained to detect fraud in a credit card transaction history. The key aspect is that we provide explanations applicable to a single prediction, rather than an interpretable set of parameters. We focus on explaining positive predictions (alerts); however, the presented methodology is symmetrically applicable to negative predictions.
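To make the distinction between a per-prediction explanation and an interpretable set of global parameters concrete, the following is a minimal sketch of explaining one alert from an opaque scorer. It is not the MES procedure from the paper or from Turner (2015). It uses a generic perturbation approach: replace one feature at a time with a typical value and record how much the score drops. The scorer `black_box_score`, the feature names, the baseline values, and the helper `explain_prediction` are all hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict

Transaction = Dict[str, float]


def black_box_score(tx: Transaction) -> float:
    """Hypothetical stand-in for an opaque fraud classifier (higher = more suspicious)."""
    score = 0.0
    if tx["amount"] > 1000.0:
        score += 0.5
    if tx["foreign"] == 1.0 and tx["hour"] < 6.0:
        score += 0.4
    return score


def explain_prediction(
    score_fn: Callable[[Transaction], float],
    tx: Transaction,
    baseline: Transaction,
) -> Dict[str, float]:
    """Return, per feature, the score drop when that feature is set to its baseline value.

    The explanation applies to this single transaction only; it says nothing
    about the classifier's behavior on other inputs.
    """
    original = score_fn(tx)
    drops = {}
    for name in tx:
        perturbed = dict(tx)              # copy so the original alert is untouched
        perturbed[name] = baseline[name]  # neutralize one feature at a time
        drops[name] = original - score_fn(perturbed)
    return drops


# One alerted transaction and an assumed "typical" transaction used as baseline.
alert = {"amount": 2500.0, "foreign": 1.0, "hour": 3.0}
typical = {"amount": 40.0, "foreign": 0.0, "hour": 14.0}

drops = explain_prediction(black_box_score, alert, typical)
for name, drop in sorted(drops.items(), key=lambda kv: -kv[1]):
    print(f"{name}: score drop {drop:.2f}")
```

The sketch only queries the scorer, never its internals, which is what makes the black-box setting workable. The same loop could be applied to a negative prediction by measuring score increases instead of drops, in line with the symmetry noted above.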