CL AI | Mar 28, 2024

The Role of Syntactic Span Preferences in Post-Hoc Explanation Disagreement

Jonathan Kamp, Lisa Beinborn, Antske Fokkens

arXiv:2403.19424v124.183 citations | h-index: 6 | Has Code | LREC

Originality: Incremental advance

AI Analysis

This work addresses the problem of inconsistent post-hoc explanations, which undermine model transparency for users, and offers incremental improvements that raise agreement across explanation methods.

The paper investigates why post-hoc explanation methods produce diverging token importance patterns. It finds that the disagreements stem from systematic linguistic preferences, and that they shrink when explanations are compared at the syntactic span level and when the most important spans are selected dynamically rather than as a fixed-size subset.

Post-hoc explanation methods are an important tool for increasing model transparency for users. Unfortunately, the currently used methods for attributing token importance often yield diverging patterns. In this work, we study potential sources of disagreement across methods from a linguistic perspective. We find that different methods systematically select different classes of words and that methods that agree most with other methods and with humans display similar linguistic preferences. Token-level differences between methods are smoothed out if we compare them on the syntactic span level. We also find higher agreement across methods by estimating the most important spans dynamically instead of relying on a fixed subset of size $k$. We systematically investigate the interaction between $k$ and spans and propose an improved configuration for selecting important tokens.
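The contrast between token-level and span-level agreement described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the helper names (`top_k`, `jaccard`, `span_scores`, `dynamic_top`), the use of Jaccard overlap as the agreement measure, mean pooling over spans, the ratio-based dynamic selection rule, and the toy attribution scores are all assumptions made for illustration. Scores are assumed to be non-negative.

```python
from typing import List, Sequence, Set, Tuple


def top_k(scores: Sequence[float], k: int) -> Set[int]:
    """Indices of the k highest scores (ties broken by position)."""
    ranked = sorted(range(len(scores)), key=lambda i: (-scores[i], i))
    return set(ranked[:k])


def jaccard(a: Set[int], b: Set[int]) -> float:
    """Overlap of two index sets; an assumed stand-in for an agreement metric."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def span_scores(scores: Sequence[float],
                spans: List[Tuple[int, int]]) -> List[float]:
    """Mean attribution per span; spans are (start, end), end exclusive."""
    return [sum(scores[s:e]) / (e - s) for s, e in spans]


def dynamic_top(scores: Sequence[float], ratio: float = 0.5) -> Set[int]:
    """Hypothetical dynamic rule: keep items scoring at least ratio * max.

    Assumes non-negative scores; the number of selected items is not fixed.
    """
    cutoff = ratio * max(scores)
    return {i for i, s in enumerate(scores) if s >= cutoff}


# Toy example: "the movie was really very good"
# Hypothetical syntactic spans: [the movie] [was] [really very good]
spans = [(0, 2), (2, 3), (3, 6)]

# Made-up attribution scores from two explanation methods.
method_a = [0.0, 0.1, 0.0, 0.5, 0.1, 0.3]
method_b = [0.0, 0.1, 0.0, 0.1, 0.5, 0.3]

# Token level with fixed k = 2: the methods pick different tokens.
token_agreement = jaccard(top_k(method_a, 2), top_k(method_b, 2))

# Span level: pool scores per span, then select spans dynamically.
spans_a = dynamic_top(span_scores(method_a, spans))
spans_b = dynamic_top(span_scores(method_b, spans))
span_agreement = jaccard(spans_a, spans_b)

print(f"token-level agreement (k=2): {token_agreement:.2f}")
print(f"span-level agreement (dynamic): {span_agreement:.2f}")
```

In this toy case the two methods rank different tokens inside the same adjective phrase highest, so their top-2 token sets only partly overlap, while both select the same span once scores are pooled.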
