CL · Oct 6, 2023

Measuring Information in Text Explanations

U of Toronto

arXiv:2310.04557v1 · 0.91 citations · h-index: 43

Originality: Incremental advance

AI Analysis

This work addresses the need for standardized evaluation criteria in explainable AI, though it is incremental as it builds on existing methods without introducing a new paradigm.

The authors tackle the problem of evaluating text explanations in explainable AI by proposing an information-theoretic framework that unifies the assessment of rationales and natural language explanations (NLEs). Their scores indicate that NLEs trade off input-related information against target-related information, while rationales do not.

Text-based explanation is a particularly promising approach in explainable AI, but the evaluation of text explanations is method-dependent. We argue that placing the explanations in an information-theoretic framework could unify the evaluations of two popular text explanation methods: rationales and natural language explanations (NLEs). This framework considers the post-hoc text pipeline as a series of communication channels, which we refer to as "explanation channels". We quantify the information flow through these channels, thereby facilitating the assessment of explanation characteristics. We set up tools for quantifying two information scores: relevance and informativeness. We illustrate what our proposed information scores measure by comparing them against some traditional evaluation metrics. Our information-theoretic scores reveal some unique observations about the underlying mechanisms of two representative text explanations. For example, the NLEs trade off slightly between transmitting the input-related information and the target-related information, whereas the rationales do not exhibit such a trade-off mechanism. Our work contributes to the ongoing efforts in establishing rigorous and standardized evaluation criteria in the rapidly evolving field of explainable AI.
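
As a rough illustration of what "quantifying information flow" can look like, the sketch below estimates mutual information from discrete samples. It is not the authors' estimator: the abstract does not define the two scores, so the sketch assumes, purely for illustration, that relevance is the information an explanation E shares with the input X and informativeness is the information it shares with the target Y. The function name `mutual_information` and the toy variables `xs`, `ys`, `es` are hypothetical, and real text explanations would need an estimator that works on embeddings rather than this plug-in count.

```python
from collections import Counter
from math import log2


def mutual_information(pairs):
    """Plug-in estimate of I(A;B) in bits from a list of (a, b) samples."""
    n = len(pairs)
    joint = Counter(pairs)
    count_a = Counter(a for a, _ in pairs)
    count_b = Counter(b for _, b in pairs)
    mi = 0.0
    for (a, b), c in joint.items():
        # p(a,b) * log2( p(a,b) / (p(a) * p(b)) ), written with raw counts
        mi += (c / n) * log2(c * n / (count_a[a] * count_b[b]))
    return mi


# Toy data (hypothetical): four inputs, a binary target, and an
# "explanation" that simply restates the target label.
xs = [0, 1, 2, 3]
ys = [0, 1, 0, 1]
es = list(ys)

# Assumed stand-ins for the two scores named in the abstract.
relevance = mutual_information(list(zip(xs, es)))        # input-related
informativeness = mutual_information(list(zip(ys, es)))  # target-related
print(relevance, informativeness)
```

In this toy setting an explanation that copied the input instead (`es = list(xs)`) would carry more input-related information, which is the kind of contrast the two scores are meant to expose.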
