CL AI HC LG

Oct 2, 2023

Quantifying the Plausibility of Context Reliance in Neural Machine Translation

Gabriele Sarti, Grzegorz Chrupała, Malvina Nissim, Arianna Bisazza

arXiv:2310.01188v23.98 citations

h-index: 35

Has Code

Originality: Incremental advance

AI Analysis

This work addresses the trustworthiness of language models in real-world settings by providing a method to evaluate context usage, though it is incremental as it builds on existing interpretability techniques.

To evaluate whether language models use contextual information plausibly, the authors introduce PECoRe, an interpretability framework that quantifies context reliance in machine translation. They show that it can identify model rationales and compare them with human annotations across several discourse phenomena.

Establishing whether language models can use contextual information in a human-plausible way is important to ensure their trustworthiness in real-world settings. However, the questions of when and which parts of the context affect model generations are typically tackled separately, with current plausibility evaluations being practically limited to a handful of artificial benchmarks. To address this, we introduce Plausibility Evaluation of Context Reliance (PECoRe), an end-to-end interpretability framework designed to quantify context usage in language models' generations. Our approach leverages model internals to (i) contrastively identify context-sensitive target tokens in generated texts and (ii) link them to contextual cues justifying their prediction. We use PECoRe to quantify the plausibility of context-aware machine translation models, comparing model rationales with human annotations across several discourse-level phenomena. Finally, we apply our method to unannotated model translations to identify context-mediated predictions and highlight instances of (im)plausible context usage throughout generation.
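
Step (i) of the abstract, contrastively identifying context-sensitive target tokens, can be illustrated with a minimal sketch. This is not the authors' implementation: the choice of KL divergence as the contrastive metric, the fixed threshold of 0.1, the function names, and the toy distributions are all assumptions made for illustration. The idea is to compare the model's next-token distribution at each target position with and without the context, and to flag positions where the two differ substantially.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions over the same vocabulary."""
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )


def context_sensitive_positions(ctx_dists, no_ctx_dists, threshold=0.1):
    """Flag target positions whose predictive distribution shifts when
    the context is removed.

    ctx_dists / no_ctx_dists: one probability distribution per target
    position, computed with and without the context (hypothetical inputs).
    threshold: assumed cutoff on the divergence score.
    """
    scores = [kl_divergence(p, q) for p, q in zip(ctx_dists, no_ctx_dists)]
    flagged = [i for i, score in enumerate(scores) if score > threshold]
    return flagged, scores


# Toy distributions over a 3-word vocabulary for 3 target positions.
with_context = [[0.7, 0.2, 0.1], [0.9, 0.05, 0.05], [0.1, 0.8, 0.1]]
without_context = [[0.7, 0.2, 0.1], [0.2, 0.7, 0.1], [0.1, 0.75, 0.15]]

flagged, scores = context_sensitive_positions(with_context, without_context)
print(flagged, [round(s, 3) for s in scores])
```

In this toy example only the second position changes sharply when the context is removed, so it is the only one flagged. Step (ii) would then attribute each flagged token back to the context tokens that influenced it, and the resulting cues could be compared against human annotations.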
