Developing a Fidelity Evaluation Approach for Interpretable Machine Learning
This work addresses the need for better evaluation metrics in interpretable machine learning, though the contribution is incremental, since it adapts existing methods to a new domain.
The paper tackles the problem of evaluating the fidelity of explainable AI methods for tabular data. It proposes a three-phase approach and adapts an existing evaluation method, but finds no clear superiority among the methods evaluated because explanation fidelity is sensitive to context.
Although modern machine learning and deep learning methods allow for complex and in-depth data analytics, the predictive models generated by these methods are often highly complex and lack transparency. Explainable AI (XAI) methods are used to improve the interpretability of these complex models, and in doing so improve transparency. However, the inherent fitness of these explainable methods can be hard to evaluate. In particular, methods to evaluate the fidelity of the explanation to the underlying black box require further development, especially for tabular data. In this paper, we (a) propose a three-phase approach to developing an evaluation method; (b) adapt an existing evaluation method, designed primarily for image and text data, so that it can be applied to models trained on tabular data; and (c) evaluate two popular explainable methods using this evaluation method. Our evaluations suggest that the internal mechanism of the underlying predictive model, the internal mechanism of the explainable method used, and the complexity of the model and data all affect explanation fidelity. Given that explanation fidelity is so sensitive to context and to the tools and data used, we could not clearly identify any specific explainable method as being superior to another.