The Language Interpretability Tool: Extensible, Interactive Visualizations and Analysis for NLP Models
This tool addresses the need of researchers and practitioners to interpret NLP models, though the contribution is incremental because it builds on existing visualization and analysis methods.
The authors address the problem of understanding NLP model behavior by developing the Language Interpretability Tool (LIT), an open-source platform that integrates visualizations, explanations, and analysis to enable rapid exploration and error analysis for tasks such as sentiment analysis and measuring gender bias in coreference systems.
We present the Language Interpretability Tool (LIT), an open-source platform for visualization and understanding of NLP models. We focus on core questions about model behavior: Why did my model make this prediction? When does it perform poorly? What happens under a controlled change in the input? LIT integrates local explanations, aggregate analysis, and counterfactual generation into a streamlined, browser-based interface to enable rapid exploration and error analysis. We include case studies for a diverse set of workflows, including exploring counterfactuals for sentiment analysis, measuring gender bias in coreference systems, and exploring local behavior in text generation. LIT supports a wide range of models--including classification, seq2seq, and structured prediction--and is highly extensible through a declarative, framework-agnostic API. LIT is under active development, with code and full documentation available at https://github.com/pair-code/lit.
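To make the "controlled change in the input" question concrete, the following is a minimal sketch of counterfactual exploration for sentiment analysis. It does not use the LIT API. The lexicon-based `toy_sentiment` scorer, the `word_swap_counterfactuals` generator, and the `compare` helper are all hypothetical names introduced here for illustration, and the toy scorer merely stands in for a real model.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical toy lexicon; a stand-in for a real sentiment model.
POSITIVE = {"great", "good", "wonderful"}
NEGATIVE = {"terrible", "bad", "boring"}


def toy_sentiment(text: str) -> float:
    """Score in [-1, 1]: (positive - negative) / number of lexicon hits."""
    tokens = text.lower().split()  # whitespace split; punctuation is not handled
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    total = pos + neg
    return 0.0 if total == 0 else (pos - neg) / total


def word_swap_counterfactuals(text: str, swaps: Dict[str, str]) -> List[str]:
    """Return one counterfactual per swappable token, changing a single word."""
    tokens = text.split()
    results = []
    for i, tok in enumerate(tokens):
        replacement = swaps.get(tok.lower())
        if replacement is not None:
            results.append(" ".join(tokens[:i] + [replacement] + tokens[i + 1:]))
    return results


def compare(
    text: str,
    model: Callable[[str], float],
    swaps: Dict[str, str],
) -> List[Tuple[str, float]]:
    """Pair each counterfactual with its score change relative to the original."""
    base = model(text)
    return [(cf, model(cf) - base) for cf in word_swap_counterfactuals(text, swaps)]


if __name__ == "__main__":
    example = "The acting was great but the plot was boring"
    for cf, delta in compare(example, toy_sentiment, {"great": "terrible", "boring": "wonderful"}):
        print(f"{delta:+.1f}  {cf}")
```

Because each counterfactual changes exactly one word, any shift in the score can be attributed to that single edit. This is the kind of side-by-side comparison the abstract describes, shown here with a toy model rather than through the tool's browser interface.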