CL | LG | Sep 4, 2020

Going Beyond T-SNE: Exposing whatlies in Text Embeddings

arXiv:2009.02113v1 | 2 citations | Has Code
Originality: Synthesis-oriented
AI Analysis

This toolkit addresses the need of NLP researchers and practitioners for better visualization and inspection of text embeddings. The contribution is incremental: it builds on existing embedding methods and visualization techniques rather than proposing new ones.

The authors tackled the problem of inspecting word and sentence embeddings by introducing whatlies, an open-source toolkit that provides a unified API for various embedding backends and combines vector arithmetic with visualization tools to make exploration more intuitive.

We introduce whatlies, an open source toolkit for visually inspecting word and sentence embeddings. The project offers a unified and extensible API with current support for a range of popular embedding backends including spaCy, tfhub, huggingface transformers, gensim, fastText and BytePair embeddings. The package combines a domain specific language for vector arithmetic with visualisation tools that make exploring word embeddings more intuitive and concise. It offers support for many popular dimensionality reduction techniques as well as many interactive visualisations that can either be statically exported or shared via Jupyter notebooks. The project documentation is available from https://rasahq.github.io/whatlies/.
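To make the idea of a "domain specific language for vector arithmetic" concrete, here is a minimal, self-contained sketch of how operator overloading can turn embedding arithmetic into short expressions. It is not the whatlies API: the `Embedding` class, its operators (`+`, `-`, `>>` for projection onto a vector, `|` for removing the component along a vector), and the three-dimensional toy vectors are all assumptions made for illustration only.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Embedding:
    """Hypothetical named vector; a stand-in for a real embedding object."""

    name: str
    vector: tuple[float, ...]

    def dot(self, other: Embedding) -> float:
        return sum(a * b for a, b in zip(self.vector, other.vector))

    def __add__(self, other: Embedding) -> Embedding:
        summed = tuple(a + b for a, b in zip(self.vector, other.vector))
        return Embedding(f"({self.name} + {other.name})", summed)

    def __sub__(self, other: Embedding) -> Embedding:
        diff = tuple(a - b for a, b in zip(self.vector, other.vector))
        return Embedding(f"({self.name} - {other.name})", diff)

    def __rshift__(self, other: Embedding) -> Embedding:
        # Orthogonal projection of self onto the direction of other.
        scale = self.dot(other) / other.dot(other)
        projected = tuple(scale * b for b in other.vector)
        return Embedding(f"({self.name} >> {other.name})", projected)

    def __or__(self, other: Embedding) -> Embedding:
        # Remove from self the component that lies along other.
        projected = self >> other
        residual = tuple(a - p for a, p in zip(self.vector, projected.vector))
        return Embedding(f"({self.name} | {other.name})", residual)


# Toy vectors chosen by hand so the arithmetic is easy to follow.
man = Embedding("man", (1.0, 0.0, 0.0))
woman = Embedding("woman", (0.0, 1.0, 0.0))
king = Embedding("king", (1.0, 0.0, 1.0))
queen = Embedding("queen", (0.0, 1.0, 1.0))

guess = king - man + woman      # the classic analogy expression
king_without_man = king | man   # king with its "man" direction removed

print(guess.name, guess.vector)
print(king_without_man.name, king_without_man.vector)
```

Because each operation returns a new named `Embedding`, expressions compose and the result carries a readable label describing how it was derived, which is convenient when the vectors are later plotted.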

Code Implementations: 1 repo
