Visualizing textual models with in-text and word-as-pixel highlighting
This work addresses the challenge that researchers and practitioners face in interpreting statistical text models, though it appears incremental because it builds on existing visualization approaches.
The authors tackled the problem of interpreting statistical text models by developing two visualization techniques: in-text annotations for token-level insights and a 'words-as-pixels' graphic for corpus-level overviews. They demonstrated these methods by diagnosing a classifier's issues with Twitter slang and analyzing a topic model on historical political texts.
We explore two techniques that use color to make sense of statistical text models. The first uses in-text annotations to illustrate a model's view of particular tokens in particular documents. The second uses a high-level, "words-as-pixels" graphic to display an entire corpus. Together, these methods offer both zoomed-in and zoomed-out perspectives on a model's understanding of text. We show how these interconnected methods help diagnose a classifier's poor performance on Twitter slang and make sense of a topic model on historical political texts.
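To make the two views concrete, the following is a minimal sketch in Python and not the authors' implementation. It assumes each token already carries a model score in [-1, 1] (for example, a signed classifier weight); the blue-white-red color scale, the one-row-per-document layout, and the names score_to_hex, annotate_tokens, and words_as_pixels are all hypothetical choices made for illustration.

```python
from html import escape


def score_to_hex(score: float) -> str:
    """Map a score in [-1, 1] to a blue-white-red hex color (assumed scale)."""
    s = max(-1.0, min(1.0, score))  # clamp out-of-range scores
    fade = round(255 * (1 - abs(s)))  # 255 at score 0 (white), 0 at the extremes
    if s >= 0:
        r, g, b = 255, fade, fade  # positive scores shade toward red
    else:
        r, g, b = fade, fade, 255  # negative scores shade toward blue
    return f"#{r:02x}{g:02x}{b:02x}"


def annotate_tokens(tokens, scores):
    """Zoomed-in view: wrap each token in an HTML span colored by its score."""
    spans = [
        f'<span style="background-color:{score_to_hex(s)}">{escape(t)}</span>'
        for t, s in zip(tokens, scores)
    ]
    return " ".join(spans)


def words_as_pixels(doc_scores, width):
    """Zoomed-out view: one row per document, one colored cell per token.

    Rows are truncated or padded with None to a fixed width (an assumed layout).
    """
    grid = []
    for scores in doc_scores:
        row = [score_to_hex(s) for s in scores[:width]]
        row += [None] * (width - len(row))
        grid.append(row)
    return grid
```

The string returned by annotate_tokens can be embedded in any HTML page to read a single document in context, while the grid from words_as_pixels can be handed to an image-drawing routine to scan the whole corpus at once. Because both views share one color mapping, a pattern spotted in the corpus overview can be traced back to the specific tokens that produce it.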