CL Aug 31, 2021

Thermostat: A Large Collection of NLP Model Explanations and Analysis Tools

arXiv:2108.13961v1661 citations
Has Code
Originality: Synthesis-oriented
AI Analysis

This work addresses accessibility and comparability in explainability research for NLP practitioners. Its contribution is incremental: it builds on existing explanation methods and adds a curated dataset and analysis tools.

The authors address the heavy compute cost and expert knowledge that neural explainability in NLP demands by creating Thermostat, a collection of over 200k model explanations with accompanying analysis tools. Compiling the collection took over 10k GPU hours, compute that the community no longer needs to spend.

In the language domain, as in other domains, neural explainability plays an ever more important role, with feature attribution methods at the forefront. Many such methods require considerable computational resources and expert knowledge about implementation details and parameter choices. To facilitate research, we present Thermostat, which consists of a large collection of model explanations and accompanying analysis tools. Thermostat allows easy access to over 200k explanations for the decisions of prominent state-of-the-art models spanning different NLP tasks, generated with multiple explainers. The dataset took over 10k GPU hours (> one year) to compile; compute time that the community now saves. The accompanying software tools allow users to analyse explanations instance-wise but also cumulatively at the corpus level. Users can investigate and compare models, datasets and explainers without the need to orchestrate implementation details. Thermostat is fully open source, democratizes explainability research in the language domain, circumvents redundant computations and increases comparability and replicability.
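
The distinction between instance-wise and corpus-level analysis can be illustrated with a minimal sketch. It does not use the Thermostat library or its data format: the `explanations` list, the attribution scores, and both helper functions are hypothetical and exist only to show the two views over precomputed feature attributions.

```python
from collections import defaultdict

# Hypothetical precomputed explanations: one list of (token, attribution)
# pairs per input instance. The scores are made up for illustration.
explanations = [
    [("a", 0.1), ("great", 0.8), ("movie", 0.3)],
    [("not", -0.6), ("great", 0.4), ("movie", 0.2)],
]


def top_tokens(instance, k=1):
    """Instance-wise view: the k tokens with the largest absolute attribution."""
    return sorted(instance, key=lambda pair: abs(pair[1]), reverse=True)[:k]


def mean_attribution_per_token(instances):
    """Corpus-level view: average attribution of each token over all instances."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for instance in instances:
        for token, score in instance:
            totals[token] += score
            counts[token] += 1
    return {token: totals[token] / counts[token] for token in totals}


if __name__ == "__main__":
    for index, instance in enumerate(explanations):
        print(index, top_tokens(instance))
    print(mean_attribution_per_token(explanations))
```

Because the attributions are already stored, both views are cheap lookups and aggregations; the expensive step of running an explainer against a model is not repeated.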

Code Implementations: 2 repos
