Metamorphic Testing with the Rashomon Set: Explanation Faithfulness in Machine Learning
For practitioners who need reliable explanations from ML models, the paper provides a model-agnostic method for evaluating explanation faithfulness, but its results are preliminary and incremental.
The paper addresses the Rashomon effect in explainable ML, where multiple models with similar performance yield different explanations. It proposes a metamorphic testing framework that assesses explanation faithfulness without ground truth and applies it to two tabular regression datasets with SHAP and LIME, offering a practical tool for selecting trustworthy models.
Multiple machine learning models can achieve near-equivalent predictive performance on the same task, yet provide divergent feature-based explanations. This is called the Rashomon effect of (explainable) machine learning, and it raises the question of which explanations, if any, are trustworthy. We propose a framework based on metamorphic testing that assesses explanation faithfulness without requiring ground-truth labels by exploring attributed feature importance from post-hoc explanation methods. Five metamorphic relations formalize expected consistency properties between model behavior and feature attributions. We apply this general framework to two tabular regression datasets and two post-hoc explainers (SHAP and LIME) to demonstrate the approach. The framework offers a practical, model-agnostic tool for selecting accurate models with reliable and trustworthy explanations.
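To make the idea concrete, the following is a minimal sketch of how a Rashomon set and a single metamorphic check could be set up. It is not the paper's implementation. Everything in it is an assumption made for illustration: the synthetic data, the ridge models, the tolerance `epsilon`, the helper names, and the relation itself, which is a hypothetical example and not one of the five relations the abstract refers to. A simple permutation-based importance score stands in for SHAP and LIME so that the example needs only NumPy.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic tabular regression data (assumption): feature 2 is pure noise.
X = rng.normal(size=(500, 3))
y = 3.0 * X[:, 0] + 1.0 * X[:, 1] + rng.normal(scale=0.1, size=500)


def fit_ridge(X, y, alpha):
    """Closed-form ridge regression; returns the weight vector."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(d), X.T @ y)


def mse(w, X, y):
    return float(np.mean((X @ w - y) ** 2))


def permutation_importance(w, X, y, rng):
    """Stand-in attribution: increase in MSE when one feature is shuffled."""
    base = mse(w, X, y)
    scores = []
    for j in range(X.shape[1]):
        Xp = X.copy()
        Xp[:, j] = rng.permutation(Xp[:, j])
        scores.append(mse(w, Xp, y) - base)
    return np.array(scores)


def prediction_shift(w, X, j):
    """Mean absolute change in prediction when feature j moves by one std."""
    Xs = X.copy()
    Xs[:, j] += X[:, j].std()
    return float(np.mean(np.abs(Xs @ w - X @ w)))


def relation_holds(w, X, importance):
    """Hypothetical relation: perturbing the least-attributed feature should
    change predictions no more than perturbing the most-attributed one."""
    lo, hi = int(np.argmin(importance)), int(np.argmax(importance))
    return prediction_shift(w, X, lo) <= prediction_shift(w, X, hi)


# Candidate models: ridge fits with different regularization strengths.
alphas = [0.01, 0.1, 1.0, 10.0, 1000.0]
models = {a: fit_ridge(X, y, a) for a in alphas}
losses = {a: mse(w, X, y) for a, w in models.items()}

# Rashomon set: models whose loss is within epsilon of the best loss.
epsilon = 0.01
best = min(losses.values())
rashomon = {a: w for a, w in models.items() if losses[a] <= best + epsilon}

# Check the relation for every model in the set; no ground-truth
# explanation is needed, only the model and its attributions.
results = {
    a: relation_holds(w, X, permutation_importance(w, X, y, rng))
    for a, w in rashomon.items()
}
print(results)
```

A model in the set that violates such a relation would be flagged as having attributions inconsistent with its own behavior, which is the kind of signal the framework uses to choose among equally accurate models.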