CL · Jul 25, 2024

Tracking linguistic information in transformer-based sentence embeddings through targeted sparsification

arXiv:2407.18119v130 citations · h-index: 5
Originality: Synthesis-oriented
AI Analysis

This work addresses how information is compressed into sentence embeddings, a question relevant to researchers in NLP and explainable AI. It is incremental in that it builds on existing analyses of transformer models.

The study investigated how linguistic information like grammatical number and semantic roles is localized in transformer-based sentence embeddings, finding that such information is encoded in specific regions rather than being distributed across the entire embedding.

Analyses of transformer-based models have shown that they encode a variety of linguistic information from their textual input. While these analyses have shed light on the relation between linguistic information on one side and internal architecture and parameters on the other, a question remains unanswered: how is this linguistic information reflected in sentence embeddings? Using datasets consisting of sentences with known structure, we test to what degree information about chunks (in particular noun, verb or prepositional phrases), such as grammatical number or semantic role, can be localized in sentence embeddings. Our results show that such information is not distributed over the entire sentence embedding, but rather is encoded in specific regions. Understanding how the information from an input text is compressed into sentence embeddings helps us understand current transformer models and helps build future explainable neural models.
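The abstract's central claim, that a property can sit in specific regions of an embedding rather than being spread across all dimensions, can be illustrated with a toy masking experiment. The sketch below is not the paper's method (this listing does not describe the sparsification procedure). It is a generic illustration on synthetic data, and every name in it (`make_synthetic_embeddings`, `centroid_probe_accuracy`, `accuracy_with_block_masked`), the 32-dimensional embedding size, and the planted signal in dimensions 8 to 11 are assumptions made for the example.

```python
import numpy as np


def make_synthetic_embeddings(n=400, dim=32, region=slice(8, 12), seed=0):
    """Toy data (assumption): a binary label, e.g. singular vs. plural,
    shifts only the dimensions in `region`; all other dimensions are noise."""
    rng = np.random.default_rng(seed)
    labels = rng.integers(0, 2, size=n)
    emb = rng.normal(size=(n, dim))
    emb[:, region] += 3.0 * (2 * labels[:, None] - 1)  # +3 or -3 per label
    return emb, labels


def centroid_probe_accuracy(train_x, train_y, test_x, test_y):
    """Nearest-class-centroid probe, a stand-in for any simple classifier."""
    c0 = train_x[train_y == 0].mean(axis=0)
    c1 = train_x[train_y == 1].mean(axis=0)
    d0 = np.linalg.norm(test_x - c0, axis=1)
    d1 = np.linalg.norm(test_x - c1, axis=1)
    pred = (d1 < d0).astype(int)
    return float((pred == test_y).mean())


def accuracy_with_block_masked(emb, labels, block, n_train=300):
    """Zero out one block of dimensions, then fit and score the probe."""
    masked = emb.copy()
    masked[:, block] = 0.0
    return centroid_probe_accuracy(
        masked[:n_train], labels[:n_train],
        masked[n_train:], labels[n_train:],
    )


emb, labels = make_synthetic_embeddings()
block_size = 4
# Map each block's starting dimension to probe accuracy with that block masked.
results = {
    start: accuracy_with_block_masked(emb, labels, slice(start, start + block_size))
    for start in range(0, emb.shape[1], block_size)
}
for start, acc in results.items():
    print(f"dims {start:2d}-{start + block_size - 1:2d} masked: accuracy {acc:.2f}")
```

In this toy setup, masking the block that carries the planted signal should push the probe toward chance, while masking any other block should leave accuracy essentially unchanged. A property spread evenly over all dimensions would instead degrade gradually, whichever block is removed.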

Code Implementations: 1 repo
