SyntaxShap: Syntax-aware Explainability Method for Text Generation
SyntaxShap addresses the need for explainability when large language models are used in safety-critical domains, though it is an incremental contribution that adapts existing methods to text generation.
The paper tackles the problem of explaining text generation models by introducing SyntaxShap, a syntax-aware method that extends Shapley values to incorporate syntactic dependencies, and shows that it produces more faithful and coherent explanations than state-of-the-art methods.
To harness the power of large language models in safety-critical domains, we need to ensure the explainability of their predictions. However, despite the significant attention given to model interpretability, explaining sequence-to-sequence tasks with methods tailored for textual data remains largely unexplored. This paper introduces SyntaxShap, a local, model-agnostic explainability method for text generation that takes into consideration the syntax in the text data. The presented work extends Shapley values to account for parsing-based syntactic dependencies. Taking a game-theoretic approach, SyntaxShap only considers coalitions constrained by the dependency tree. We adopt a model-based evaluation to compare SyntaxShap and its weighted form to state-of-the-art explainability methods adapted to text generation tasks, using diverse metrics including faithfulness, coherency, and semantic alignment of the explanations to the model. We show that our syntax-aware method produces more faithful and coherent explanations for predictions by autoregressive models. Confronted with the misalignment of human and AI model reasoning, this paper also highlights the need for cautious evaluation strategies in explainable AI.
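The idea of restricting Shapley coalitions by a dependency tree can be illustrated with a minimal sketch. It is not the paper's algorithm: the admissibility rule (a token may appear in a coalition only if its syntactic head is also present), the renormalization of the Shapley weights over admissible coalitions, and the names `ancestor_closed`, `constrained_shapley`, `heads`, and `toy_value` are all assumptions made for illustration. The toy value function stands in for a model's score on a masked input.

```python
from itertools import combinations
from math import factorial


def ancestor_closed(heads):
    """Build a coalition filter from a dependency tree (illustrative assumption).

    heads[i] is the index of token i's syntactic head, or -1 for the root.
    A coalition is accepted only if every token in it has its head in it too.
    """
    def allowed(coalition):
        return all(heads[i] == -1 or heads[i] in coalition for i in coalition)
    return allowed


def constrained_shapley(n, value, allowed=lambda coalition: True):
    """Shapley-style scores for tokens 0..n-1 over admissible coalitions only.

    value maps a frozenset of token indices to a float.
    A marginal contribution counts only when both S and S | {i} are admissible.
    Weights are the classic Shapley weights, renormalized over the pairs kept
    (an assumption; with no constraint this reduces to classic Shapley values).
    """
    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total, weight_sum = 0.0, 0.0
        for size in range(n):
            w = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = frozenset(subset)
                with_i = s | {i}
                if allowed(s) and allowed(with_i):
                    total += w * (value(with_i) - value(s))
                    weight_sum += w
        phi.append(total / weight_sum if weight_sum else 0.0)
    return phi


# Hypothetical three-token sentence whose middle token is the root.
heads = [1, -1, 1]
weights = [0.2, 0.5, 0.3]


def toy_value(coalition):
    """Additive stand-in for a model score; not a real language model."""
    return sum(weights[i] for i in coalition)


scores = constrained_shapley(3, toy_value, ancestor_closed(heads))
```

Because the toy value function is additive, each token's score equals its own weight whether or not the tree constraint is applied. With a real model the constraint changes which masked inputs are evaluated, and therefore the resulting scores.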