xai_evals: A Framework for Evaluating Post-Hoc Local Explanation Methods
This work addresses the need for rigorous evaluation of explanation methods in high-stakes AI applications, though its contribution is incremental because it builds on existing techniques.
The authors address the evaluation of post-hoc explanation methods for machine learning models by developing xai_evals, a Python framework that benchmarks techniques such as SHAP and LIME. The resulting tool supports metrics such as faithfulness and robustness, with the aim of improving model transparency.
The growing complexity of machine learning and deep learning models has led to an increased reliance on opaque "black box" systems, making it difficult to understand the rationale behind predictions. This lack of transparency is particularly challenging in high-stakes applications where interpretability is as important as accuracy. Post-hoc explanation methods are commonly used to interpret these models, but they are seldom rigorously evaluated, raising concerns about their reliability. The Python package xai_evals addresses this by providing a comprehensive framework for generating, benchmarking, and evaluating explanation methods across both tabular and image data modalities. It integrates popular techniques like SHAP, LIME, Grad-CAM, Integrated Gradients (IG), and Backtrace, while supporting evaluation metrics such as faithfulness, sensitivity, and robustness. xai_evals enhances the interpretability of machine learning models, fostering transparency and trust in AI systems. The library is open-sourced at https://pypi.org/project/xai-evals/.
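
To make the idea of a faithfulness metric concrete, the following is a minimal sketch of one common formulation: the correlation between each feature's attribution and the drop in model output when that feature is replaced by a baseline value. It does not use or reproduce the xai_evals API. The function name `faithfulness_correlation`, the toy linear model, and the zero baseline are all assumptions chosen for illustration, and only NumPy is required.

```python
import numpy as np

def faithfulness_correlation(predict, x, attributions, baseline):
    """Pearson correlation between per-feature attributions and the drop in
    model output when each feature alone is replaced by its baseline value.
    Hypothetical helper for illustration; not part of xai_evals."""
    base_pred = predict(x)
    drops = np.empty(len(x))
    for i in range(len(x)):
        x_perturbed = x.copy()
        x_perturbed[i] = baseline[i]  # "remove" feature i
        drops[i] = base_pred - predict(x_perturbed)
    return float(np.corrcoef(attributions, drops)[0, 1])

# Toy linear model (an assumption): f(x) = w . x
weights = np.array([2.0, -1.0, 0.5, 0.0])

def predict(x):
    return float(weights @ x)

x = np.array([1.0, 2.0, 3.0, 4.0])
baseline = np.zeros_like(x)

# For a linear model, w * (x - baseline) is the exact per-feature contribution.
exact_attr = weights * (x - baseline)
# Reversing the attribution vector assigns contributions to the wrong features.
shuffled_attr = exact_attr[::-1]

faithful_score = faithfulness_correlation(predict, x, exact_attr, baseline)
unfaithful_score = faithfulness_correlation(predict, x, shuffled_attr, baseline)

print(f"exact attributions:    {faithful_score:.3f}")
print(f"shuffled attributions: {unfaithful_score:.3f}")
```

In this toy setting the exact attributions correlate perfectly with the observed output drops, while the shuffled attributions score lower. With a real model, attributions from a method such as SHAP or LIME would take the place of `exact_attr`, and the choice of baseline becomes a design decision that affects the score.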