Higher-order Comparisons of Sentence Encoder Representations
Representational Similarity Analysis (RSA) provides a more transparent and robust method for interpretability in NLP, addressing the overfitting and large training-data requirements of existing probing approaches.
The paper compares sentence encoder representations by applying Representational Similarity Analysis (RSA), a technique from neuroscience, to language models, and finds a previously unknown correspondence between pretrained encoders and human processing difficulty as measured by eye-tracking.
Representational Similarity Analysis (RSA) is a technique developed by neuroscientists for comparing activity patterns across different measurement modalities (e.g., fMRI, electrophysiology, behavior). As a framework, RSA has several advantages over existing approaches to the interpretation of language encoders based on probing or diagnostic classification: it does not require large training samples, is not prone to overfitting, and enables a more transparent comparison between the representational geometries of different models and modalities. We demonstrate the utility of RSA by establishing a previously unknown correspondence between widely-employed pretrained language encoders and human processing difficulty via eye-tracking data, showcasing its potential in the interpretability toolbox for neural models.
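The general RSA recipe can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: it assumes correlation distance (1 minus Pearson r) for the first-order dissimilarities and Spearman rank correlation for the second-order comparison, neither of which is specified in the text above. The function names and the toy encoder data are hypothetical.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def rdm_upper(reps):
    """Upper triangle of the representational dissimilarity matrix.

    reps holds one vector per stimulus; dissimilarity is assumed to be
    1 - Pearson r between the vectors of each stimulus pair.
    """
    n = len(reps)
    return [1.0 - pearson(reps[i], reps[j])
            for i in range(n) for j in range(i + 1, n)]


def ranks(values):
    """Ranks starting at 1, with ties given their average rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = avg
        i = j + 1
    return out


def rsa_score(reps_a, reps_b):
    """Second-order similarity: Spearman correlation between two RDMs."""
    return pearson(ranks(rdm_upper(reps_a)), ranks(rdm_upper(reps_b)))


# Toy data (hypothetical): 6 stimuli seen by three "encoders".
random.seed(0)
enc_a = [[random.gauss(0, 1) for _ in range(8)] for _ in range(6)]
# Same geometry as enc_a under a positive affine rescaling of every unit.
enc_b = [[3.0 * v + 2.0 for v in vec] for vec in enc_a]
# Unrelated encoder with a different dimensionality.
enc_c = [[random.gauss(0, 1) for _ in range(5)] for _ in range(6)]

score_ab = rsa_score(enc_a, enc_b)
score_ac = rsa_score(enc_a, enc_c)
```

Because the comparison happens between dissimilarity matrices rather than between raw vectors, the two systems only need to share the same stimuli, not the same dimensionality. That is what lets an encoder be compared with a behavioral measure such as eye-tracking, and no classifier is trained at any point.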