CL, HC, LG | Apr 30, 2022

Visualizing and Explaining Language Models

arXiv:2205.10238v1 | 7 citations | h-index: 15
Originality: Synthesis-oriented
AI Analysis

This work addresses model interpretability for NLP researchers and practitioners. Its contribution is incremental, since it surveys existing visualization methods rather than introducing new ones.

The paper addresses the challenge of interpreting and explaining black-box language models in NLP. It reviews visualization techniques, such as coloring salient words, clustering, and displaying neuron activations, that improve model transparency.

During the last decade, Natural Language Processing has become, after Computer Vision, the second field of Artificial Intelligence to be massively changed by the advent of Deep Learning. Regardless of the architecture, the language models of the day need to be able to process or generate text, as well as predict missing words, sentences, or relations, depending on the task. Due to their black-box nature, such models are difficult to interpret and explain to third parties. Visualization is often the bridge that language model designers use to explain their work, as the coloring of salient words and phrases, clustering, or neuron activations can be used to quickly understand the underlying models. This paper showcases the techniques used in some of the most popular Deep Learning for NLP visualizations, with a special focus on interpretability and explainability.
