CL · Dec 30, 2020

Introducing Orthogonal Constraint in Structural Probes

arXiv:2012.15228v2714 citations
AI Analysis

This work provides an incremental improvement to structural probing, a method for interpreting pre-trained NLP models, by making it less vulnerable to memorization.

This paper introduces a new structural probing method that decomposes linear projection into isomorphic space rotation and linear scaling to identify and scale relevant dimensions. The method is evaluated on syntactic dependency, lexical hypernymy, and sentence position tasks, demonstrating separation of lexical and syntactic information and reduced memorization.

With the recent success of pre-trained models in NLP, a significant focus has been put on interpreting their representations. One of the most prominent approaches is structural probing (Hewitt and Manning, 2019), where a linear projection of word embeddings is performed in order to approximate the topology of dependency structures. In this work, we introduce a new type of structural probing, where the linear projection is decomposed into (1) an isomorphic space rotation and (2) a linear scaling that identifies and scales the most relevant dimensions. In addition to syntactic dependency, we evaluate our method on novel tasks (lexical hypernymy and position in a sentence). We jointly train the probes for multiple tasks and experimentally show that lexical and syntactic information is separated in the representations. Moreover, the orthogonal constraint makes the Structural Probes less vulnerable to memorization.
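The decomposition described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the toy dimensions, and the use of a QR factorization to obtain an orthogonal matrix are all assumptions made for illustration, and no training is performed. The sketch only shows that a rotation alone preserves pairwise distances, while a subsequent per-dimension scaling selects which rotated dimensions contribute to the probed distance.

```python
import numpy as np


def random_orthogonal(dim, rng):
    """Return a random orthogonal matrix (hypothetical stand-in for a learned rotation)."""
    q, _ = np.linalg.qr(rng.standard_normal((dim, dim)))
    return q


def probe_distances(H, V, d):
    """Squared pairwise distances between rows of H after rotation V and scaling d.

    H: (n_words, dim) word embeddings
    V: (dim, dim) orthogonal rotation
    d: (dim,) per-dimension scaling vector
    """
    Z = (H @ V.T) * d                      # rotate each embedding, then scale each dimension
    diff = Z[:, None, :] - Z[None, :, :]   # all pairwise differences
    return (diff ** 2).sum(axis=-1)


rng = np.random.default_rng(0)
H = rng.standard_normal((5, 8))            # toy "sentence": 5 words, 8-dim embeddings
V = random_orthogonal(8, rng)

# With unit scaling, the rotation alone leaves distances unchanged.
D_rotated = probe_distances(H, V, np.ones(8))
D_original = probe_distances(H, np.eye(8), np.ones(8))

# A sparse scaling vector keeps only a few rotated dimensions (here, the first 3).
d_sparse = np.zeros(8)
d_sparse[:3] = 1.0
D_sparse = probe_distances(H, V, d_sparse)
```

In a trained probe, the rotation and the scaling vector would be learned (with the rotation kept orthogonal by some constraint or regularizer), and a separate scaling vector per task could share one rotation when probes are trained jointly. Those details are left out of this sketch.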
