CL, AI
Feb 25, 2024

Hitting "Probe"rty with Non-Linearity, and More

arXiv:2402.16168v1
h-index: 1
Originality: Incremental advance
AI Analysis

This work is incremental, improving methods for analyzing syntactic information in language models, primarily benefiting researchers in NLP and interpretability.

The authors tackled the limitation of linear structural probes in fully capturing dependency tree information in language models by introducing non-linear structural probes, finding that the radial basis function (RBF) variant outperforms linear probes for BERT.

Structural probes learn a linear transformation to find how dependency trees are embedded in the hidden states of language models. This simple design may not fully exploit the structure of the encoded information. To investigate that structure to its full extent, we incorporate non-linear structural probes. We reformulate the non-linear structural probes introduced by White et al., making the design simpler yet effective. We also design a visualization framework that lets us qualitatively assess how strongly two words in a sentence are connected in the predicted dependency tree. We use this technique to understand which non-linear probe variant is good at encoding syntactic information. We also use it to qualitatively investigate the structure of the dependency trees that BERT encodes in each of its layers. We find that the radial basis function (RBF) probe is more effective for the BERT model than the linear probe.
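To make the contrast between the two probe families concrete, here is a minimal sketch in plain Python. The linear distance follows the usual structural-probe form, the squared norm of a learned matrix B applied to the difference of two hidden states. The RBF variant shown is an assumption for illustration: it uses the standard kernel-induced distance k(x, x) + k(y, y) - 2k(x, y) on the projected vectors, which may differ from the exact formulation in the paper. The function names and the `gamma` parameter are hypothetical.

```python
import math

def project(B, v):
    """Apply the probe matrix B (a list of rows) to the vector v."""
    return [sum(b * x for b, x in zip(row, v)) for row in B]

def linear_probe_sq_dist(B, h_i, h_j):
    """Squared distance ||B(h_i - h_j)||^2 between two hidden states."""
    diff = [a - b for a, b in zip(h_i, h_j)]
    return sum(x * x for x in project(B, diff))

def rbf_probe_sq_dist(B, h_i, h_j, gamma=1.0):
    """Assumed RBF variant: kernel-induced squared distance.

    With k(x, y) = exp(-gamma * ||Bx - By||^2), we have k(x, x) = 1, so
    k(x, x) + k(y, y) - 2 k(x, y) = 2 - 2 * exp(-gamma * ||B(x - y)||^2).
    Note this value is bounded above by 2; a trained probe would likely
    need an extra scale to match larger tree distances.
    """
    return 2.0 - 2.0 * math.exp(-gamma * linear_probe_sq_dist(B, h_i, h_j))

# Toy example: identity probe matrix and two 2-dimensional "hidden states".
B = [[1.0, 0.0], [0.0, 1.0]]
h_a, h_b = [1.0, 0.0], [0.0, 0.0]
print(linear_probe_sq_dist(B, h_a, h_b))
print(rbf_probe_sq_dist(B, h_a, h_b))
```

In training, B (and, in this sketch, `gamma`) would be fit so that the predicted distance between each word pair approximates the number of edges between those words in the gold dependency tree.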

