Can we Agree? On the Rashōmon Effect and the Reliability of Post-Hoc Explainable AI
This work addresses the challenge of making post-hoc explainable AI trustworthy for practitioners, though the contribution is incremental: it studies sample-size effects using existing methods.
The study tackles the problem of unreliable machine learning explanations caused by the Rashōmon effect. It finds that SHAP explanations from different models converge as sample size increases, while high variability below 128 samples limits reliable knowledge extraction.
The Rashōmon effect poses challenges for deriving reliable knowledge from machine learning models. This study examined how sample size influences the SHAP explanations of models in a Rashōmon set. Experiments on 5 public datasets showed that explanations gradually converged as the sample size increased. Explanations from models trained on fewer than 128 samples exhibited high variability, limiting reliable knowledge extraction. However, agreement between models improved with more data, allowing for consensus. Bagging ensembles often had higher agreement. The results provide guidance on how much data is sufficient to trust explanations. The variability at small sample sizes suggests that conclusions drawn there may be unreliable without validation. Further work is needed with more model types, data domains, and explanation methods. Testing convergence in neural networks and with model-specific explanation methods would be impactful. The approaches explored here point towards principled techniques for eliciting knowledge from ambiguous models.
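The convergence claim can be illustrated with a small, self-contained sketch. It is not the study's experimental setup, and several pieces are assumptions made for illustration: the data are synthetic, bootstrap-refitted linear models stand in for a Rashōmon set, SHAP values use the closed form for linear models under feature independence (phi_j = w_j * (x_j - mean_j)) instead of a SHAP library, and agreement is measured as mean pairwise cosine similarity of the mean absolute SHAP vectors. All function names and parameters are hypothetical.

```python
import numpy as np

def linear_shap_importance(weights, X):
    """Mean |SHAP| per feature for a linear model.

    Assumes independent features, so phi_ij = w_j * (x_ij - mean_j).
    """
    centered = X - X.mean(axis=0)
    return np.abs(centered * weights).mean(axis=0)

def mean_pairwise_agreement(importances):
    """Average pairwise cosine similarity between importance vectors."""
    V = np.asarray(importances, dtype=float)
    V = V / np.linalg.norm(V, axis=1, keepdims=True)
    sims = V @ V.T
    upper = np.triu_indices(len(V), k=1)  # each unordered pair once
    return float(sims[upper].mean())

def agreement_at_sample_size(n, n_models=20, n_features=8, noise=2.0, seed=0):
    """Fit n_models bootstrap models on n samples and score their agreement."""
    rng = np.random.default_rng(seed)
    true_w = np.linspace(1.0, 0.1, n_features)  # synthetic ground truth
    X = rng.normal(size=(n, n_features))
    y = X @ true_w + rng.normal(scale=noise, size=n)

    importances = []
    for _ in range(n_models):
        idx = rng.integers(0, n, size=n)  # bootstrap resample (stand-in for a Rashōmon set)
        w, *_ = np.linalg.lstsq(X[idx], y[idx], rcond=None)
        importances.append(linear_shap_importance(w, X))
    return mean_pairwise_agreement(importances)

if __name__ == "__main__":
    for n in (16, 64, 256, 1024):
        print(n, round(agreement_at_sample_size(n), 3))
```

In this toy setting the agreement score should rise towards 1 as n grows, which mirrors the qualitative trend described above. The specific 128-sample threshold reported in the study depends on its datasets and models and is not something this sketch reproduces.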