Mar 20, 2021

Local Interpretations for Explainable Natural Language Processing: A Survey

arXiv:2103.11072v3, 69 citations
AI Analysis

The paper addresses NLP practitioners' need for transparency in black-box models, but its contribution is incremental: it synthesizes existing methods rather than introducing new techniques.

This survey addresses the interpretability of deep neural networks for NLP tasks such as machine translation and sentiment analysis. It reviews local interpretation methods and groups them into three categories.

As the use of deep learning techniques has grown across various fields over the past decade, complaints about the opaqueness of the black-box models have increased, resulting in an increased focus on transparency in deep learning models. This work investigates various methods to improve the interpretability of deep neural networks for Natural Language Processing (NLP) tasks, including machine translation and sentiment analysis. We provide a comprehensive discussion on the definition of the term interpretability and its various aspects at the beginning of this work. The methods collected and summarised in this survey are only associated with local interpretation and are specifically divided into three categories: 1) interpreting the model's predictions through related input features; 2) interpreting through natural language explanation; 3) probing the hidden states of models and word representations.

