Explaining Reject Options of Learning Vector Quantization Classifiers
This work addresses interpretability for users who need to understand why a model rejected an input; the contribution is incremental in that it builds on existing explainable AI methods.
The paper addresses the lack of explanations for why a machine learning model rejects an input. It proposes counterfactual explanations for reject options in prototype-based classifiers such as learning vector quantization, and investigates how to compute these explanations efficiently.
While machine learning models are usually assumed to always output a prediction, there also exist extensions in the form of reject options, which allow the model to reject inputs for which only a prediction with unacceptably low certainty would be possible. With the ongoing rise of eXplainable AI, many methods for explaining model predictions have been developed. However, understanding why a given input was rejected, instead of being classified by the model, is also of interest. Surprisingly, explanations of rejects have not been considered so far. We propose to use counterfactual explanations for explaining rejects and investigate how to efficiently compute counterfactual explanations of different reject options for an important class of models, namely prototype-based classifiers such as learning vector quantization models.
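To make the two ideas in the abstract concrete, the following is a minimal sketch of a reject option on a prototype-based classifier and a naive counterfactual for a rejected input. It rests on several assumptions that are not taken from the text above: certainty is measured by relative similarity (one certainty measure used with LVQ models), distances are squared Euclidean, the model has prototypes from at least two classes, and the counterfactual is found by a simple line search toward each prototype. All function names and the threshold `theta` are hypothetical, and the line search is an illustrative stand-in, not the efficient computation the paper investigates.

```python
import math


def sq_dist(a, b):
    """Squared Euclidean distance between two points given as sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def rel_sim(x, prototypes, labels):
    """Return (label of nearest prototype, relative-similarity certainty).

    Certainty is (d_minus - d_plus) / (d_minus + d_plus), where d_plus is the
    distance to the nearest prototype and d_minus the distance to the nearest
    prototype of any other class. Assumes at least two classes are present.
    """
    d = [sq_dist(x, p) for p in prototypes]
    i_plus = min(range(len(d)), key=d.__getitem__)
    d_plus = d[i_plus]
    d_minus = min(di for di, lab in zip(d, labels) if lab != labels[i_plus])
    denom = d_plus + d_minus
    certainty = 0.0 if denom == 0 else (d_minus - d_plus) / denom
    return labels[i_plus], certainty


def classify_with_reject(x, prototypes, labels, theta):
    """Return the predicted label, or None if certainty is below theta."""
    label, certainty = rel_sim(x, prototypes, labels)
    return None if certainty < theta else label


def counterfactual_for_reject(x, prototypes, labels, theta, steps=1000):
    """Naive counterfactual: walk from x toward each prototype in small steps
    and keep the accepted point closest to x. Returns (point, distance) or None.
    """
    best = None
    for p in prototypes:
        for k in range(steps + 1):
            t = k / steps
            cand = [xi + t * (pi - xi) for xi, pi in zip(x, p)]
            if classify_with_reject(cand, prototypes, labels, theta) is not None:
                dist = math.sqrt(sq_dist(x, cand))
                if best is None or dist < best[1]:
                    best = (cand, dist)
                break  # first accepted point on this line is the closest one
    return best


# Toy model (hypothetical data): one prototype per class on a line.
prototypes = [(0.0, 0.0), (4.0, 0.0)]
labels = ["a", "b"]
x = (2.0, 0.0)  # equidistant from both prototypes, so certainty is 0

print(classify_with_reject(x, prototypes, labels, theta=0.5))  # rejected: None
print(counterfactual_for_reject(x, prototypes, labels, theta=0.5))
```

The returned point reads as a counterfactual statement of the form "had the input been shifted by this much, the model would have classified it rather than rejecting it". A practical method would typically also aim for sparse changes and exploit the structure of the model, which this line search does not attempt.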