CL Dec 21, 2022

Reconstruction Probing

Najoung Kim, Jatin Khilnani, Alex Warstadt, Abed Qaddoumi

arXiv:2212.10792v121.4223 citations h-index: 21

Originality: Incremental advance

AI Analysis

This work provides a new analysis tool for understanding contextualized representations in NLP. The advance is incremental because it builds on existing MLM frameworks.

The authors analyze contextualized representations in masked language models by introducing reconstruction probing, a method that quantifies how contextualization affects token reconstruction probabilities. They find that contextualization boosts reconstructability for tokens that are close to the token being reconstructed in linear and syntactic distance, and that these boosts are largely attributable to static and positional embeddings at the input layer.

We propose reconstruction probing, a new analysis method for contextualized representations based on reconstruction probabilities in masked language models (MLMs). This method relies on comparing the reconstruction probabilities of tokens in a given sequence when conditioned on the representation of a single token that has been fully contextualized and when conditioned on only the decontextualized lexical prior of the model. This comparison can be understood as quantifying the contribution of contextualization towards reconstruction -- the difference in the reconstruction probabilities can only be attributed to the representational change of the single token induced by contextualization. We apply this analysis to three MLMs and find that contextualization boosts reconstructability of tokens that are close to the token being reconstructed in terms of linear and syntactic distance. Furthermore, we extend our analysis to finer-grained decomposition of contextualized representations, and we find that these boosts are largely attributable to static and positional embeddings at the input layer.
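The comparison described in the abstract can be sketched in a few lines. This is only an illustration, not the authors' implementation: it assumes the two sets of per-position reconstruction probabilities have already been obtained from an MLM, and the function names (`reconstruction_boost`, `mean_boost_by_distance`), the use of a plain difference as the boost, and the toy numbers are all hypothetical.

```python
from collections import defaultdict


def reconstruction_boost(p_contextualized, p_prior):
    """Per-position difference between the two reconstruction probabilities.

    p_contextualized[j]: probability of reconstructing token j when conditioned
        on the fully contextualized representation of one source token.
    p_prior[j]: probability of reconstructing token j when conditioned only on
        the decontextualized lexical prior.
    """
    if len(p_contextualized) != len(p_prior):
        raise ValueError("both conditions must cover the same positions")
    return [pc - pp for pc, pp in zip(p_contextualized, p_prior)]


def mean_boost_by_distance(boosts, source_index):
    """Average the boost over positions at the same linear distance from the source token."""
    grouped = defaultdict(list)
    for position, boost in enumerate(boosts):
        grouped[abs(position - source_index)].append(boost)
    return {d: sum(v) / len(v) for d, v in sorted(grouped.items())}


# Toy, made-up probabilities for a 5-token sequence (NOT results from the paper).
p_ctx = [0.10, 0.40, 0.90, 0.35, 0.05]
p_prior = [0.05, 0.05, 0.60, 0.05, 0.05]

boosts = reconstruction_boost(p_ctx, p_prior)
by_distance = mean_boost_by_distance(boosts, source_index=2)
```

Grouping by linear distance mirrors the kind of analysis the abstract reports. Grouping by syntactic distance would need a parse of the sentence, which this sketch leaves out.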
