AI | CL | LG | Apr 25, 2022

Integrating Prior Knowledge in Post-hoc Explanations

arXiv:2204.11634v1 | 8 citations | h-index: 23
Originality: Incremental advance
AI Analysis

This work addresses the need for explanations tailored to individual users of AI systems. The advance is incremental because it builds on existing counterfactual explanation methods.

The paper tackles the problem of improving post-hoc explanations in XAI by integrating user-specific prior knowledge, resulting in a new method called KICE that generates more understandable and personalized counterfactual explanations, with experimental validation on benchmark datasets.

In the field of eXplainable Artificial Intelligence (XAI), post-hoc interpretability methods aim to explain the predictions of a trained decision model to a user. Integrating prior knowledge into such methods aims to make explanations easier to understand and to allow personalised explanations adapted to each user. In this paper, we propose a cost function that explicitly integrates prior knowledge into the interpretability objectives: we present a general framework for the optimization problem of post-hoc interpretability methods, and show that user knowledge can thus be integrated into any method by adding a compatibility term to the cost function. We instantiate the proposed formalization in the case of counterfactual explanations and propose a new interpretability method, Knowledge Integration in Counterfactual Explanation (KICE), to optimize it. An experimental study on several benchmark data sets characterizes the counterfactual instances generated by KICE and compares them with those of reference methods.
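The idea of adding a compatibility term to a counterfactual cost function can be illustrated with a small sketch. This is not the KICE algorithm and not the paper's cost function: the penalty (distance moved along features the user does not know), the random-search optimizer, the toy classifier, and all names (`penalised_cost`, `search_counterfactual`, `known_features`, `lam`) are assumptions made for illustration only.

```python
import numpy as np


def penalised_cost(x, e, known_features, lam=1.0):
    """Proximity of candidate e to x, plus a hypothetical incompatibility penalty.

    Assumption: user knowledge is a set of feature indices the user understands,
    and incompatibility is the distance moved along all other features.
    """
    delta = e - x
    proximity = np.linalg.norm(delta)
    unknown = np.ones(len(x), dtype=bool)
    unknown[list(known_features)] = False
    incompatibility = np.linalg.norm(delta[unknown])
    return proximity + lam * incompatibility


def search_counterfactual(predict, x, known_features, lam=1.0,
                          n_samples=5000, radius=3.0, seed=0):
    """Naive random search for a binary classifier (a stand-in optimizer, not KICE)."""
    rng = np.random.default_rng(seed)
    target = 1 - predict(x[None, :])[0]          # the opposite class
    candidates = x + rng.uniform(-radius, radius, size=(n_samples, len(x)))
    valid = candidates[predict(candidates) == target]
    if len(valid) == 0:
        return None                               # no counterfactual found
    costs = [penalised_cost(x, e, known_features, lam) for e in valid]
    return valid[int(np.argmin(costs))]


def toy_classifier(X):
    """Hypothetical model: class 1 when x0 + x1 > 1, else class 0."""
    return (X[:, 0] + X[:, 1] > 1.0).astype(int)


x = np.array([0.0, 0.0])
# lam = 0 ignores user knowledge and returns the closest sampled counterfactual.
cf_plain = search_counterfactual(toy_classifier, x, known_features=[0, 1], lam=0.0)
# The user only knows feature 0, so changes to feature 1 are penalised.
cf_user = search_counterfactual(toy_classifier, x, known_features=[0], lam=5.0)
print("without knowledge term:", cf_plain)
print("with knowledge term:   ", cf_user)
```

In this sketch the weight `lam` trades proximity against compatibility: with `lam=0` the search reduces to a plain nearest-counterfactual search, while larger values push the explanation toward changes expressed in features the user knows.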

