Alike Parts: A Feature-Informed Approach to Local and Global Prototype Explanations
For users of black-box classifiers, this work provides more granular, feature-informed prototype explanations without sacrificing model fidelity.
The paper introduces a framework that integrates feature importance into prototype-based explanations to improve interpretability. It shows that augmenting global prototype selection with feature diversity maintains or increases prediction fidelity across six benchmark datasets.
Prototype-based explanations offer an intuitive, example-based way to interpret black-box machine learning classifiers, but they often lack feature-level granularity. We introduce a framework that integrates feature importance at two levels to address this gap. First, for local explanations, we propose \textit{alike parts}: a method that uses feature importance scores to highlight the most relevant feature subsets shared by a classified instance and its nearest prototype, guiding user attention. Second, we augment the global prototype selection objective function with a feature importance term that actively promotes diversity in the feature attributions of the selected prototypes. Experiments on six benchmark datasets show that this augmented selection process maintains or, in some cases, increases the prediction fidelity of the surrogate model, suggesting that feature diversity does not compromise model fidelity.
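
To make the local idea concrete, the following is a minimal sketch of how shared, important features between an instance and its nearest prototype might be picked out. It is an illustration under stated assumptions, not the method defined in the paper: the function name `alike_parts`, the value-closeness test with a `tolerance`, the scoring rule (the smaller of the two absolute importance scores), and the `top_k` cutoff are all hypothetical choices, and the feature values are assumed to be scaled to a comparable range.

```python
from typing import List, Sequence


def alike_parts(
    instance: Sequence[float],
    prototype: Sequence[float],
    importance_instance: Sequence[float],
    importance_prototype: Sequence[float],
    tolerance: float = 0.1,
    top_k: int = 3,
) -> List[int]:
    """Return indices of features that are close in value AND important for both.

    Hypothetical rule (an assumption, not taken from the paper):
      1. a feature is "shared" if |x_j - p_j| <= tolerance;
      2. its score is min(|importance_x_j|, |importance_p_j|);
      3. the top_k shared features with a positive score are returned.
    """
    scored = []
    for j, (x, p, wx, wp) in enumerate(
        zip(instance, prototype, importance_instance, importance_prototype)
    ):
        if abs(x - p) > tolerance:
            continue  # values differ too much to count as a shared part
        scored.append((min(abs(wx), abs(wp)), j))

    # Highest score first; ties broken by the lower feature index.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [j for score, j in scored[:top_k] if score > 0]


# Toy data (made up for illustration): four features scaled to [0, 1].
instance = [0.50, 0.90, 0.10, 0.30]
prototype = [0.55, 0.20, 0.12, 0.31]
importance_instance = [0.40, 0.35, 0.05, 0.20]
importance_prototype = [0.30, 0.10, 0.02, 0.25]

highlighted = alike_parts(
    instance, prototype, importance_instance, importance_prototype, top_k=2
)
print(highlighted)  # [0, 3]
```

In this toy case feature 1 is important for the instance but differs strongly from the prototype, so it is excluded, while feature 2 is shared but carries little importance and falls below the cutoff. Any feature importance method could supply the two score vectors; the sketch does not assume a particular one.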