CL | Oct 18, 2024

Efficiently Computing Susceptibility to Context in Language Models

arXiv:2410.14361v1 | 22 citations | h-index: 40 | EMNLP
Originality: Incremental advance
AI Analysis

This work addresses the need for more efficient tools to analyze context sensitivity in language models, which is incremental but important for researchers and practitioners.

The paper tackles the problem of efficiently measuring how sensitive language models are to changes in context, proposing Fisher susceptibility as a method that is 70 times faster than the existing Monte Carlo approximation while yielding comparable results.

One strength of modern language models is their ability to incorporate information from a user-input context when answering queries. However, they are not equally sensitive to subtle changes to that context. To quantify this, Du et al. (2024) give an information-theoretic metric to measure such sensitivity. Their metric, susceptibility, is defined as the degree to which contexts can influence a model's response to a query at a distributional level. However, exactly computing susceptibility is difficult and, thus, Du et al. (2024) fall back on a Monte Carlo approximation. Due to the large number of samples required, the Monte Carlo approximation is inefficient in practice. As a faster alternative, we propose Fisher susceptibility, an efficient method to estimate susceptibility based on Fisher information. Empirically, we validate that Fisher susceptibility is comparable to Monte Carlo estimated susceptibility across a diverse set of query domains despite being $70\times$ faster. Exploiting the improved efficiency, we apply Fisher susceptibility to analyze factors affecting the susceptibility of language models. We observe that larger models are as susceptible as smaller ones.
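The abstract does not spell out either estimator, so the following is only a toy sketch of what a Monte Carlo susceptibility estimate could look like. It assumes susceptibility can be read as the mutual information between a sampled context and the model's answer distribution for a fixed query, estimated as the average KL divergence between each context-conditioned answer distribution and their mean. The function names, the `TOY_MODEL` table, and the two-answer setup are hypothetical illustrations, not the authors' code, and the Fisher-based estimator is not shown because its details are not given here.

```python
import math
import random


def kl_divergence(p, q):
    """KL(p || q) in nats for two discrete distributions over the same answers."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def monte_carlo_susceptibility(answer_dist_given_context, contexts, num_samples, seed=0):
    """Toy Monte Carlo estimate of how much sampled contexts move the answer distribution.

    answer_dist_given_context: hypothetical callable mapping a context to a
        probability vector over a fixed answer set (stands in for a model call).
    """
    rng = random.Random(seed)
    sampled = [rng.choice(contexts) for _ in range(num_samples)]
    dists = [answer_dist_given_context(c) for c in sampled]

    # Answer distribution averaged over the sampled contexts.
    num_answers = len(dists[0])
    marginal = [sum(d[i] for d in dists) / len(dists) for i in range(num_answers)]

    # Mean divergence of each context-conditioned distribution from the average.
    return sum(kl_divergence(d, marginal) for d in dists) / len(dists)


# Hypothetical two-answer "model": each context shifts the answer probabilities.
TOY_MODEL = {
    "ctx_a": [0.9, 0.1],
    "ctx_b": [0.1, 0.9],
    "ctx_c": [0.5, 0.5],
}
CONTEXTS = list(TOY_MODEL)

sensitive = monte_carlo_susceptibility(lambda c: TOY_MODEL[c], CONTEXTS, num_samples=200)
insensitive = monte_carlo_susceptibility(lambda c: [0.7, 0.3], CONTEXTS, num_samples=200)
print(f"context-sensitive toy model:   {sensitive:.4f} nats")
print(f"context-insensitive toy model: {insensitive:.4f} nats")
```

In this sketch every sample costs one call to the model stand-in, which illustrates why an estimator needing many sampled contexts per query becomes expensive with a real language model.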

