On Sample Based Explanation Methods for NLP: Efficiency, Faithfulness, and Semantic Evaluation
This work addresses interpretability and efficiency issues in sample-based explanation methods for NLP. It offers a novel approach and an evaluation metric that aligns better with human judgment, though its advances are incremental.
The paper tackles the challenge of applying sample-based explanation methods to large NLP models. It introduces a method that uses arbitrary text sequences as explanation units, implements a Hessian-free approach with a faithfulness guarantee, and proposes a semantic-based evaluation metric. Empirical results on multiple datasets show superior performance over methods such as Influence Function and TracIn on semantic evaluation.
In recent advances in natural language processing, state-of-the-art models and datasets are usually very large, which challenges the application of sample-based explanation methods in many respects, such as explanation interpretability, efficiency, and faithfulness. In this work, for the first time, we improve the interpretability of explanations by allowing arbitrary text sequences as the explanation unit. On top of this, we implement a Hessian-free method with a model faithfulness guarantee. Finally, to compare our method with others, we propose a semantic-based evaluation metric that aligns better with human judgment of explanations than the widely adopted diagnostic or re-training measures. The empirical results on multiple real datasets demonstrate the proposed method's superior performance over popular explanation techniques such as Influence Function or TracIn on semantic evaluation.
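To make the idea of a Hessian-free, sample-based explanation concrete, the following is a minimal sketch of a gradient-product influence score in the style of TracIn at a single checkpoint. It is not the method proposed in the paper, whose details are not given here. The logistic-regression model, the toy data, and the function names (`loss_gradient`, `gradient_dot_influence`) are all assumptions made for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def loss_gradient(weights, features, label):
    """Gradient of the logistic loss w.r.t. the weights for one example."""
    prediction = sigmoid(sum(w * x for w, x in zip(weights, features)))
    return [(prediction - label) * x for x in features]


def gradient_dot_influence(weights, train_example, test_example):
    """Hessian-free influence: dot product of train and test loss gradients.

    A positive score means a gradient step on the training example would
    lower the test loss; a negative score means it would raise it.
    """
    g_train = loss_gradient(weights, *train_example)
    g_test = loss_gradient(weights, *test_example)
    return sum(a * b for a, b in zip(g_train, g_test))


# Hypothetical toy setup: a two-feature model at an all-zero checkpoint.
weights = [0.0, 0.0]
test_example = ([1.0, 0.0], 1)
train_examples = [
    ([1.0, 0.0], 1),  # same features, same label as the test example
    ([1.0, 0.0], 0),  # same features, opposite label
    ([0.0, 1.0], 1),  # features orthogonal to the test example
]

scores = [
    gradient_dot_influence(weights, example, test_example)
    for example in train_examples
]
print(scores)
```

In this toy setup the first training example gets a positive score, the second a negative score, and the third a score of zero. No Hessian is computed or inverted, which is what keeps this family of scores cheap relative to the Influence Function.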