Model Agnostic Local Explanations of Reject
This work addresses the need for transparency in safety-critical ML systems by making reject decisions explainable, an incremental advance in explainable AI.
The paper tackles the problem of explaining why a machine learning system rejects uncertain samples, proposing a model-agnostic method that uses interpretable models and counterfactual explanations to provide local explanations for arbitrary reject options.
The application of machine-learning-based decision-making systems in safety-critical areas requires reliable, high-certainty predictions. Reject options are a common way of ensuring that the predictions made by the system are sufficiently certain. While being able to reject uncertain samples is important, it is also important to be able to explain why a particular sample was rejected. However, explaining general reject options is still an open problem. We propose a model-agnostic method for locally explaining arbitrary reject options by means of interpretable models and counterfactual explanations.
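To make the idea concrete, the following is a minimal sketch under stated assumptions, not the authors' procedure. It treats a reject option as a black-box function, samples points around a rejected input, fits a small interpretable surrogate to the reject/accept labels, and returns the nearest accepted sample as a simple counterfactual. Everything here is hypothetical and chosen for illustration: the toy `certainty` score, the 0.75 threshold, the Gaussian sampling scheme, the use of a decision stump as the interpretable model, and the nearest-accepted-sample search.

```python
import math
import random


def certainty(x):
    """Toy certainty score: confidence of a logistic model with boundary x0 + x1 = 0."""
    p = 1.0 / (1.0 + math.exp(-(x[0] + x[1])))
    return max(p, 1.0 - p)


def reject(x, threshold=0.75):
    """Black-box reject option (assumed): True when certainty is below the threshold."""
    return certainty(x) < threshold


def sample_neighbourhood(x, n=300, scale=1.0, seed=0):
    """Draw n Gaussian perturbations around x."""
    rng = random.Random(seed)
    return [[xi + rng.gauss(0.0, scale) for xi in x] for _ in range(n)]


def fit_stump(points, labels):
    """Fit a one-feature threshold rule that imitates the reject labels.

    Returns (accuracy, feature, threshold, sign); the rule predicts "reject"
    when sign * (x[feature] - threshold) > 0.
    """
    best = None
    for j in range(len(points[0])):
        # -inf is included so the constant rules (always/never reject) are candidates.
        candidates = [float("-inf")] + sorted({p[j] for p in points})
        for t in candidates:
            for sign in (1, -1):
                preds = [sign * (p[j] - t) > 0 for p in points]
                acc = sum(a == b for a, b in zip(preds, labels)) / len(labels)
                if best is None or acc > best[0]:
                    best = (acc, j, t, sign)
    return best


def nearest_counterfactual(x, points, labels):
    """Closest sampled point that the reject option accepts, or None if there is none."""
    accepted = [p for p, rejected in zip(points, labels) if not rejected]
    if not accepted:
        return None
    return min(accepted, key=lambda p: math.dist(p, x))


def explain_reject(x, reject_fn, n=300, scale=1.0, seed=0):
    """Query the black box around x, then build both local explanations."""
    points = sample_neighbourhood(x, n=n, scale=scale, seed=seed)
    labels = [reject_fn(p) for p in points]
    return fit_stump(points, labels), nearest_counterfactual(x, points, labels)


if __name__ == "__main__":
    x = [0.1, -0.2]
    stump, counterfactual = explain_reject(x, reject)
    print("rejected:", reject(x))
    print("surrogate (accuracy, feature, threshold, sign):", stump)
    print("nearest accepted point:", counterfactual)
```

Because `explain_reject` only calls `reject_fn` on sampled points, any reject option can be substituted without changing the rest of the sketch, which is the sense in which such an approach is model-agnostic. The reported surrogate accuracy indicates how faithfully the simple rule mimics the reject option in the sampled neighbourhood; a low value suggests the local explanation should not be trusted.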