CL | Apr 20, 2025

Disentangling Linguistic Features with Dimension-Wise Analysis of Vector Embeddings

arXiv:2504.14766v1 | 12 citations | h-index: 2 | Proceedings of the 5th Workshop on Trustworthy NLP (TrustNLP 2025)
Originality: Incremental advance
AI Analysis

This work addresses the interpretability of language models, a concern for both researchers and developers, with potential applications in bias mitigation and responsible AI deployment. The contribution is incremental in nature.

The paper tackles the challenge of interpreting high-dimensional neural embeddings such as BERT's by proposing a framework that identifies the specific dimensions encoding distinct linguistic properties. Using a new dataset and several statistical methods, it shows that properties such as negation are robustly encoded in certain dimensions.

Understanding the inner workings of neural embeddings, particularly in models such as BERT, remains a challenge because of their high-dimensional and opaque nature. This paper proposes a framework for uncovering the specific dimensions of vector embeddings that encode distinct linguistic properties (LPs). We introduce the Linguistically Distinct Sentence Pairs (LDSP-10) dataset, which isolates ten key linguistic features such as synonymy, negation, tense, and quantity. Using this dataset, we analyze BERT embeddings with various methods, including the Wilcoxon signed-rank test, mutual information, and recursive feature elimination, to identify the most influential dimensions for each LP. We introduce a new metric, the Embedding Dimension Impact (EDI) score, which quantifies the relevance of each embedding dimension to an LP. Our findings show that certain properties, such as negation and polarity, are robustly encoded in specific dimensions, while others, like synonymy, exhibit more complex patterns. This study provides insights into the interpretability of embeddings, which can guide the development of more transparent and optimized language models, with implications for model bias mitigation and the responsible deployment of AI systems.
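To make the dimension-wise idea concrete, the following is a minimal sketch of one of the methods named in the abstract: a Wilcoxon signed-rank test run separately on each embedding dimension over paired sentence embeddings. It is not the paper's implementation and does not compute the EDI score, whose definition is not given here. Synthetic random vectors stand in for BERT embeddings, the function names (`wilcoxon_signed_rank`, `rank_dimensions`) and the constants are hypothetical, and the test uses a normal approximation without tie correction; in practice a library routine such as `scipy.stats.wilcoxon` would likely be preferred.

```python
import math
import random


def wilcoxon_signed_rank(x, y):
    """Two-sided Wilcoxon signed-rank test (normal approximation, no tie correction).

    Returns (W_plus, p_value). Pairs with a zero difference are dropped.
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]
    n = len(diffs)
    if n == 0:
        return 0.0, 1.0
    # Rank the absolute differences, averaging ranks over ties.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w_plus - mean) / sd
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability
    return w_plus, p_value


def rank_dimensions(emb_a, emb_b):
    """Return (dimension, p_value) pairs sorted from most to least significant."""
    n_dims = len(emb_a[0])
    results = []
    for d in range(n_dims):
        col_a = [row[d] for row in emb_a]
        col_b = [row[d] for row in emb_b]
        _, p_value = wilcoxon_signed_rank(col_a, col_b)
        results.append((d, p_value))
    return sorted(results, key=lambda pair: pair[1])


# Synthetic stand-in for sentence-pair embeddings (assumption: not real BERT output).
rng = random.Random(0)
N_PAIRS, N_DIMS, SHIFTED_DIM = 200, 16, 5
emb_a = [[rng.gauss(0, 1) for _ in range(N_DIMS)] for _ in range(N_PAIRS)]
emb_b = [[v + rng.gauss(0, 0.1) for v in row] for row in emb_a]
for row in emb_b:
    row[SHIFTED_DIM] += 1.0  # hypothetical dimension that responds to the property

ranking = rank_dimensions(emb_a, emb_b)
top_dim, top_p = ranking[0]
print(f"most affected dimension: {top_dim} (p = {top_p:.3g})")
```

With real embeddings, each row of `emb_a` and `emb_b` would hold the vectors for the two sentences of a pair that differ in a single linguistic property. Because the test is repeated once per dimension, a multiple-comparison correction would likely be needed before reading individual p-values as evidence.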

