CL, May 23, 2022

A Fine-grained Interpretability Evaluation Benchmark for Neural NLP

arXiv:2205.11097v2, 298 citations, h-index: 52
Originality: Incremental advance
AI Analysis

The benchmark gives researchers working on NLP interpretability a standardized evaluation tool, though the advance is incremental because it builds on existing concerns about model transparency.

The authors address the problem of evaluating interpretability in neural NLP models by building a benchmark with token-level rationales for three tasks in English and Chinese. They also introduce a consistency metric that assesses interpretability across tasks, and they use it to reveal strengths and weaknesses of models and saliency methods.

While there is increasing concern about the interpretability of neural models, the evaluation of interpretability remains an open problem, due to the lack of proper evaluation datasets and metrics. In this paper, we present a novel benchmark to evaluate the interpretability of both neural models and saliency methods. This benchmark covers three representative NLP tasks: sentiment analysis, textual similarity and reading comprehension, each provided with both English and Chinese annotated data. In order to precisely evaluate the interpretability, we provide token-level rationales that are carefully annotated to be sufficient, compact and comprehensive. We also design a new metric, i.e., the consistency between the rationales before and after perturbations, to uniformly evaluate the interpretability on different types of tasks. Based on this benchmark, we conduct experiments on three typical models with three saliency methods, and unveil their strengths and weaknesses in terms of interpretability. We will release this benchmark at https://www.luge.ai/#/luge/task/taskDetail?taskId=15 and hope it can facilitate the research in building trustworthy systems.
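
The idea of comparing rationales before and after a perturbation can be illustrated with a small sketch. The abstract does not give the metric's formula, so the code below is not the authors' definition: it assumes a rationale is the set of the k highest-scoring tokens from a saliency method, and it swaps in plain Jaccard overlap as the consistency measure. The function names, the example sentences, and the saliency scores are all hypothetical.

```python
from typing import Sequence, Set


def top_k_rationale(tokens: Sequence[str], scores: Sequence[float], k: int) -> Set[str]:
    """Return the k tokens with the highest saliency scores as a set.

    Assumption: a rationale is just the top-k tokens. Using a set drops
    token positions and duplicates, which keeps the sketch simple.
    """
    if len(tokens) != len(scores):
        raise ValueError("tokens and scores must have the same length")
    ranked = sorted(zip(tokens, scores), key=lambda pair: pair[1], reverse=True)
    return {token for token, _ in ranked[:k]}


def rationale_consistency(original: Set[str], perturbed: Set[str]) -> float:
    """Jaccard overlap of two rationale sets (1.0 means identical).

    This is a stand-in measure chosen for illustration, not the metric
    defined in the paper.
    """
    if not original and not perturbed:
        return 1.0
    return len(original & perturbed) / len(original | perturbed)


# Hypothetical saliency scores for a sentence and a perturbed copy
# in which "movie" has been replaced by "film".
orig_tokens = ["the", "movie", "was", "surprisingly", "good"]
orig_scores = [0.05, 0.20, 0.05, 0.30, 0.40]
pert_tokens = ["the", "film", "was", "surprisingly", "good"]
pert_scores = [0.05, 0.15, 0.05, 0.35, 0.40]

before = top_k_rationale(orig_tokens, orig_scores, k=3)
after = top_k_rationale(pert_tokens, pert_scores, k=3)
print(rationale_consistency(before, after))  # prints 0.5
```

Under these assumptions, a score near 1.0 means the saliency method picks nearly the same tokens after the perturbation, while a low score means the explanation shifted even though the input changed only slightly.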
