CL · May 15, 2021

The Low-Dimensional Linear Geometry of Contextualized Word Representations

arXiv:2105.07109v231.2675 citations

Originality: Incremental advance

AI Analysis

This work offers insight into the interpretability of black-box NLP models, though the advance is incremental because it builds on existing probing methods.

The study investigated how linguistic features are encoded in contextualized word representations from ELMo and BERT, finding that these features are organized in low-dimensional subspaces with hierarchical relationships and can be manipulated to alter model behavior.

Black-box probing models can reliably extract linguistic features like tense, number, and syntactic role from pretrained word representations. However, the manner in which these features are encoded in representations remains poorly understood. We present a systematic study of the linear geometry of contextualized word representations in ELMo and BERT. We show that a variety of linguistic features (including structured dependency relationships) are encoded in low-dimensional subspaces. We then refine this geometric picture, showing that there are hierarchical relations between the subspaces encoding general linguistic categories and more specific ones, and that low-dimensional feature encodings are distributed rather than aligned to individual neurons. Finally, we demonstrate that these linear subspaces are causally related to model behavior, and can be used to perform fine-grained manipulation of BERT's output distribution.
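To make the idea of a feature living in a low-dimensional linear subspace concrete, here is a minimal sketch on synthetic data. It is not the authors' procedure and does not use ELMo or BERT: the vectors are random 16-dimensional points whose three classes differ only inside a planted 2-dimensional subspace, the probe is an ordinary least-squares classifier, and the subspace is read off the probe weights with an SVD. All names (`make_data`, `fit_probe`, `targets`, `accuracy`) and all sizes are assumptions chosen for illustration.

```python
import numpy as np


def make_data(rng, n, basis, class_means):
    """Synthetic 'representations': class signal in span(basis) plus isotropic noise."""
    labels = rng.integers(0, len(class_means), size=n)
    X = class_means[labels] @ basis.T + rng.normal(size=(n, basis.shape[0]))
    return X, labels


def targets(labels, k):
    """One-hot targets shifted by 1/k; the constant shift does not change the argmax."""
    return np.eye(k)[labels] - 1.0 / k


def fit_probe(X, Y):
    """Least-squares linear probe without a bias term (the data is roughly zero-mean)."""
    W, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return W


def accuracy(X, W, labels):
    return float(np.mean(np.argmax(X @ W, axis=1) == labels))


rng = np.random.default_rng(0)
d, r, k = 16, 2, 3  # hypothetical sizes: ambient dim, subspace dim, classes

# Planted subspace: an orthonormal d x r basis and class means that sum to zero.
basis, _ = np.linalg.qr(rng.normal(size=(d, r)))
class_means = np.array([[4.0, 0.0], [-2.0, 3.5], [-2.0, -3.5]])

X_train, y_train = make_data(rng, 20000, basis, class_means)
X_test, y_test = make_data(rng, 5000, basis, class_means)

# 1) Probe on the full d-dimensional vectors.
W = fit_probe(X_train, targets(y_train, k))
acc_full = accuracy(X_test, W, y_test)

# 2) Top-r left singular vectors of W give an orthonormal basis for the
#    directions the probe actually uses.
U, _, _ = np.linalg.svd(W, full_matrices=False)
U = U[:, :r]

# 3) Probe that only sees the r coordinates inside that subspace.
W_low = fit_probe(X_train @ U, targets(y_train, k))
acc_low = accuracy(X_test @ U, W_low, y_test)

# 4) Ablation: project the subspace out, then refit a fresh probe.
P = np.eye(d) - U @ U.T
W_abl = fit_probe(X_train @ P, targets(y_train, k))
acc_ablated = accuracy(X_test @ P, W_abl, y_test)

print(f"full probe:          {acc_full:.3f}")
print(f"{r}-dim subspace probe: {acc_low:.3f}")
print(f"after ablation:      {acc_ablated:.3f}")
```

In this toy setting the 2-dimensional probe should match the full probe, while a probe refit after ablation should fall to roughly chance. That mirrors, in spirit only, the abstract's claims that features are encoded in low-dimensional subspaces and that removing those subspaces changes downstream behavior.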

