CL · Sep 1, 2019

Higher-order Comparisons of Sentence Encoder Representations

arXiv:1909.00303v21004 citations
Originality: Incremental advance
AI Analysis

Representational Similarity Analysis (RSA) offers a more transparent and robust interpretability method for NLP, avoiding the overfitting and large training-sample requirements of probing-based approaches.

The paper tackled the problem of comparing sentence encoder representations by applying Representational Similarity Analysis (RSA), a technique from neuroscience, to language models, and found a previously unknown correspondence between pretrained encoders and human eye-tracking data.

Representational Similarity Analysis (RSA) is a technique developed by neuroscientists for comparing activity patterns of different measurement modalities (e.g., fMRI, electrophysiology, behavior). As a framework, RSA has several advantages over existing approaches to interpretation of language encoders based on probing or diagnostic classification: namely, it does not require large training samples, is not prone to overfitting, and enables a more transparent comparison between the representational geometries of different models and modalities. We demonstrate the utility of RSA by establishing a previously unknown correspondence between widely-employed pretrained language encoders and human processing difficulty via eye-tracking data, showcasing its potential in the interpretability toolbox for neural models.
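
To make the idea of comparing representational geometries concrete, here is a minimal sketch of RSA in plain Python. It is an illustration under stated assumptions, not the paper's implementation: the choice of Euclidean distance for the first-order dissimilarities, Spearman correlation for the second-order comparison, the function names, and the toy vectors are all hypothetical. Each representation space is reduced to the pairwise dissimilarities between the same set of stimuli, and the two dissimilarity patterns are then correlated, so the spaces need not share a dimensionality and nothing is trained.

```python
import math
from itertools import combinations


def euclidean_distance(u, v):
    """First-order dissimilarity between two stimulus representations (assumed metric)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def rdm_upper_triangle(vectors, distance=euclidean_distance):
    """Flattened upper triangle of the representational dissimilarity matrix (pairs i < j)."""
    return [distance(vectors[i], vectors[j])
            for i, j in combinations(range(len(vectors)), 2)]


def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2.0 + 1.0
        for k in order[start:end + 1]:
            ranks[k] = mean_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; assumes neither input is constant."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


def spearman(x, y):
    """Spearman correlation computed as the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def rsa_score(space_a, space_b, distance=euclidean_distance):
    """Second-order similarity: correlate the two spaces' dissimilarity patterns."""
    return spearman(rdm_upper_triangle(space_a, distance),
                    rdm_upper_triangle(space_b, distance))


# Toy data (hypothetical): four stimuli represented in two different spaces.
# Row k in each list must describe the same stimulus.
encoder_space = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [3.0, 3.0]]
# A second space with a different dimensionality, e.g. per-stimulus measurements.
other_space = [[0.1, 0.0, 0.2], [0.9, 0.1, 0.0], [0.2, 1.8, 0.1], [2.7, 3.1, 0.3]]

score = rsa_score(encoder_space, other_space)
print(f"RSA score: {score:.3f}")
```

Because only the rank order of the pairwise dissimilarities matters in this sketch, a uniformly rescaled copy of a space yields a score of approximately 1.0 against the original. With real data, the rows would be encoder activations and human measurements (such as eye-tracking features) for the same sentences or words.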

