CL · AI · LG · Aug 22, 2022

Interpreting Embedding Spaces by Conceptualization

arXiv:2209.00445v3 · 133 citations · h-index: 33 · Has Code
AI Analysis

This paper addresses the need for human-interpretable embedding spaces, which matters for debugging, comparing models, and detecting biases. The contribution is incremental in that it builds on existing embedding methods.

The paper tackles the problem of incomprehensible embedding spaces from large language models by proposing a method to transform them into a comprehensible conceptual space with dynamic granularity, showing through evaluation that the conceptualized vectors effectively represent the original semantics.

One of the main methods for computational interpretation of a text is mapping it into a vector in some embedding space. Such vectors can then be used for a variety of text processing tasks. Most recent embedding spaces are a product of training large language models (LLMs). One major drawback of this type of representation is its incomprehensibility to humans. Understanding the embedding space is crucial for several important needs, including the need to debug the embedding method and compare it to alternatives, and the need to detect biases hidden in the model. In this paper, we present a novel method of understanding embeddings by transforming a latent embedding space into a comprehensible conceptual space. We present an algorithm for deriving a conceptual space with dynamic on-demand granularity. We devise a new evaluation method, using either human raters or LLM-based raters, to show that the conceptualized vectors indeed represent the semantics of the original latent ones. We show the use of our method for various tasks, including comparing the semantics of alternative models and tracing the layers of the LLM. The code is available online at https://github.com/adiSimhi/Interpreting-Embedding-Spaces-by-Conceptualization.
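The core idea of moving from a latent space to a conceptual space can be illustrated with a small sketch. It is not the authors' implementation (see the linked repository for that). It assumes that each conceptual dimension is scored as the cosine similarity between the text's latent vector and the latent vector of a named concept. The function names, the three toy concepts, and the 3-dimensional vectors are all hypothetical and chosen only for illustration.

```python
import numpy as np

def conceptualize(latent_vec, concept_vecs):
    """Map a latent vector to a conceptual vector.

    Assumption: each output dimension is the cosine similarity between
    the input vector and one concept's vector in the same latent space.
    """
    v = latent_vec / np.linalg.norm(latent_vec)
    c = concept_vecs / np.linalg.norm(concept_vecs, axis=1, keepdims=True)
    return c @ v  # shape: (number of concepts,)

def top_concepts(latent_vec, concept_names, concept_vecs, k=2):
    """Return the k concepts that best describe the latent vector."""
    scores = conceptualize(latent_vec, concept_vecs)
    order = np.argsort(-scores)[:k]  # indices sorted by descending score
    return [(concept_names[i], float(scores[i])) for i in order]

# Toy data: in practice both the text and the concepts would be embedded
# by the same model; here the vectors are made up.
concept_names = ["Sports", "Politics", "Music"]
concept_vecs = np.array([
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
])
text_vec = np.array([0.9, 0.1, 0.3])

scores = conceptualize(text_vec, concept_vecs)
ranked = top_concepts(text_vec, concept_names, concept_vecs, k=2)
print(ranked)
```

Each entry of the resulting vector is attached to a human-readable concept name, so the highest-scoring entries can be read directly as a description of the text. Changing the granularity in this sketch would amount to swapping in a coarser or finer set of concept vectors.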

Code Implementations: 1 repo