CL | LG | Jan 30, 2023

Quantifying Context Mixing in Transformers

Hosein Mohebbi, Willem Zuidema, Grzegorz Chrupała, Afra Alishahi

arXiv:2301.12971v2 | 29.3284 citations | h-index: 25 | Has Code

Originality: Incremental advance

AI Analysis

This work responds to the need for more faithful methods of analyzing Transformer models in NLP. It is an incremental advance, since it builds on prior analysis techniques.

The authors analyze token-to-token interactions in Transformers by proposing Value Zeroing, a novel context mixing score that takes the entire encoder block into account. They report that it outperforms existing analysis methods in evaluations based on linguistic rationales, probing, and faithfulness analysis.

Self-attention weights and their transformed variants have been the main source of information for analyzing token-to-token interactions in Transformer-based models. But despite their ease of interpretation, these weights are not faithful to the models' decisions as they are only one part of an encoder, and other components in the encoder layer can have considerable impact on information mixing in the output representations. In this work, by expanding the scope of analysis to the whole encoder block, we propose Value Zeroing, a novel context mixing score customized for Transformers that provides us with a deeper understanding of how information is mixed at each encoder layer. We demonstrate the superiority of our context mixing score over other analysis methods through a series of complementary evaluations with different viewpoints based on linguistically informed rationales, probing, and faithfulness analysis.
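The abstract names the score but does not spell out how it is computed. As a rough illustration only, the sketch below assumes the following reading of "Value Zeroing": zero the value vector of one context token j inside the attention sublayer, leave the attention weights untouched, re-run the whole encoder block, and record how far each token i's output representation moves (cosine distance), then normalize each row to sum to 1. The toy single-head block, the random weights, and every function name here (`encoder_block`, `value_zeroing_scores`, `rand_matrix`) are hypothetical and are not taken from the authors' code.

```python
import math
import random

def matmul(X, W):
    """Multiply an (n x d) list-of-lists by a (d x k) list-of-lists."""
    return [[sum(x * w for x, w in zip(row, col)) for col in zip(*W)] for row in X]

def softmax(v):
    m = max(v)
    e = [math.exp(x - m) for x in v]
    s = sum(e)
    return [x / s for x in e]

def layer_norm(v, eps=1e-5):
    mu = sum(v) / len(v)
    var = sum((x - mu) ** 2 for x in v) / len(v)
    return [(x - mu) / math.sqrt(var + eps) for x in v]

def cosine_distance(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return max(0.0, 1.0 - dot / (na * nb))  # clamp tiny negative rounding errors

def encoder_block(X, Wq, Wk, Wv, W1, W2, zero_value_of=None):
    """Toy single-head block: attention + residual + LN, then ReLU FFN + residual + LN."""
    n, d = len(X), len(X[0])
    Q, K, V = matmul(X, Wq), matmul(X, Wk), matmul(X, Wv)
    if zero_value_of is not None:
        # Assumed intervention: only the value vector is zeroed; queries and
        # keys (and hence the attention weights) stay exactly as they were.
        V[zero_value_of] = [0.0] * d
    out = []
    for i in range(n):
        scores = [sum(q * k for q, k in zip(Q[i], K[j])) / math.sqrt(d) for j in range(n)]
        attn = softmax(scores)
        ctx = [sum(attn[j] * V[j][c] for j in range(n)) for c in range(d)]
        h = layer_norm([x + c for x, c in zip(X[i], ctx)])
        hidden = [max(0.0, z) for z in matmul([h], W1)[0]]
        ffn = matmul([hidden], W2)[0]
        out.append(layer_norm([a + b for a, b in zip(h, ffn)]))
    return out

def value_zeroing_scores(X, params):
    """scores[i][j]: share of token i's output change caused by zeroing token j's value."""
    base = encoder_block(X, *params)
    n = len(X)
    raw = [[0.0] * n for _ in range(n)]
    for j in range(n):
        altered = encoder_block(X, *params, zero_value_of=j)
        for i in range(n):
            raw[i][j] = cosine_distance(base[i], altered[i])
    scores = []
    for row in raw:
        total = sum(row)
        scores.append([v / total if total > 0 else 0.0 for v in row])
    return scores

def rand_matrix(rows, cols):
    return [[random.gauss(0.0, 0.5) for _ in range(cols)] for _ in range(rows)]

random.seed(0)
n_tokens, d_model = 4, 6
X = rand_matrix(n_tokens, d_model)  # stand-in for token representations
params = (rand_matrix(d_model, d_model), rand_matrix(d_model, d_model),
          rand_matrix(d_model, d_model), rand_matrix(d_model, 2 * d_model),
          rand_matrix(2 * d_model, d_model))
scores = value_zeroing_scores(X, params)
for row in scores:
    print([round(v, 3) for v in row])
```

Because the altered output passes through the residual connections, layer normalization, and feed-forward sublayer before it is compared, the score reflects the whole block rather than the attention weights alone, which is the distinction the abstract draws. A real analysis would apply the same intervention to a trained multi-head model rather than to random weights.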
