Toward Scalable and Unified Example-based Explanation and Outlier Detection
This work addresses the need for interpretability and reliability in AI systems used in critical applications, though its contribution is incremental because it builds on existing prototype-based methods.
The paper tackles the problem of providing example-based explanations and outlier detection for neural networks in high-stakes decision-making, proposing a prototype-based network that achieves meaningful explanations and promising outlier detection results without compromising classification accuracy.
When neural networks are employed for high-stakes decision-making, it is desirable that they provide explanations for their predictions so that we can understand the features that contributed to each decision. At the same time, it is important to flag potential outliers for in-depth verification by domain experts. In this work we propose to unify two differing aspects of explainability with outlier detection. We argue for a broader adoption of prototype-based student networks that can provide an example-based explanation for their prediction and, at the same time, identify regions of similarity between the predicted sample and the examples. The examples are real prototypical cases sampled from the training set via our novel iterative prototype replacement algorithm. Furthermore, we propose to use the prototype similarity scores for identifying outliers. We compare the classification performance, explanation quality, and outlier detection of our proposed network against other baselines. We show that our prototype-based networks, which go beyond similarity kernels, deliver meaningful explanations and promising outlier detection results without compromising classification accuracy.
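The idea of reusing prototype similarity scores for both explanation and outlier detection can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not the authors' implementation: the function names `similarity` and `explain_and_flag`, the inverse-distance similarity, the fixed threshold, and the toy 2-D embeddings are all hypothetical. The sketch returns the closest prototype as the example-based explanation and flags a sample as a potential outlier when even its best similarity score is low.

```python
def similarity(x, p):
    """Assumed stand-in similarity: maps squared Euclidean distance into (0, 1]."""
    d2 = sum((a - b) ** 2 for a, b in zip(x, p))
    return 1.0 / (1.0 + d2)


def explain_and_flag(x, prototypes, threshold=0.5):
    """Return (label, prototype index, best similarity, is_outlier).

    `prototypes` is a list of (embedding, label) pairs. The label of the most
    similar prototype is used as the prediction (a simplification), and the
    sample is flagged when its best similarity falls below `threshold`.
    """
    scores = [similarity(x, vec) for vec, _ in prototypes]
    best = max(range(len(scores)), key=scores.__getitem__)
    label = prototypes[best][1]
    return label, best, scores[best], scores[best] < threshold


# Hypothetical embeddings of two real training examples acting as prototypes.
prototypes = [([0.0, 0.0], "benign"), ([4.0, 4.0], "malignant")]

inlier = explain_and_flag([0.2, -0.1], prototypes)     # close to prototype 0
outlier = explain_and_flag([10.0, -10.0], prototypes)  # far from every prototype
```

In this toy setting the first sample is explained by prototype 0 and is not flagged, while the second is flagged because no prototype resembles it. In practice the threshold would have to be chosen on held-out data rather than fixed by hand.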