CL | AI | HC | LG | Oct 2, 2023

Quantifying the Plausibility of Context Reliance in Neural Machine Translation

arXiv:2310.01188v28 citations | h-index: 35
Originality: Incremental advance
AI Analysis

This work addresses the trustworthiness of language models in real-world settings by providing a method to evaluate context usage, though it is incremental as it builds on existing interpretability techniques.

To evaluate whether language models use contextual information plausibly, the authors introduce PECoRe, an interpretability framework that quantifies context reliance in machine translation. They show that it can identify model rationales and compare them with human annotations across discourse phenomena.

Establishing whether language models can use contextual information in a human-plausible way is important to ensure their trustworthiness in real-world settings. However, the questions of when and which parts of the context affect model generations are typically tackled separately, with current plausibility evaluations being practically limited to a handful of artificial benchmarks. To address this, we introduce Plausibility Evaluation of Context Reliance (PECoRe), an end-to-end interpretability framework designed to quantify context usage in language models' generations. Our approach leverages model internals to (i) contrastively identify context-sensitive target tokens in generated texts and (ii) link them to contextual cues justifying their prediction. We use PECoRe to quantify the plausibility of context-aware machine translation models, comparing model rationales with human annotations across several discourse-level phenomena. Finally, we apply our method to unannotated model translations to identify context-mediated predictions and highlight instances of (im)plausible context usage throughout generation.
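Step (i) of the abstract, contrastively identifying context-sensitive target tokens, can be illustrated with a toy sketch. This is not the authors' implementation: the use of KL divergence as the contrastive metric, the fixed threshold, the function names, and the hand-written probability vectors are all assumptions made for illustration. In practice the distributions would come from running the same model with and without the context, and step (ii), attributing flagged tokens to contextual cues, requires model internals and is not shown.

```python
import math

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) between two discrete distributions given as lists."""
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )

def context_sensitive_tokens(tokens, probs_with_ctx, probs_without_ctx, threshold=0.5):
    """Flag target tokens whose next-token distribution shifts when context is added.

    `threshold` is a hypothetical cutoff chosen for this toy example.
    Returns a list of (position, token, score) tuples.
    """
    flagged = []
    for i, (tok, p, q) in enumerate(zip(tokens, probs_with_ctx, probs_without_ctx)):
        score = kl_divergence(p, q)
        if score > threshold:
            flagged.append((i, tok, score))
    return flagged

# Toy data: a two-word vocabulary and two generated tokens.
# The first token's distribution changes a lot with context; the second does not.
tokens = ["She", "left"]
with_ctx = [[0.9, 0.1], [0.5, 0.5]]
without_ctx = [[0.2, 0.8], [0.5, 0.5]]

flagged = context_sensitive_tokens(tokens, with_ctx, without_ctx)
for position, token, score in flagged:
    print(f"token {position} ({token!r}) is context-sensitive, score={score:.3f}")
```

Only the first token is flagged here, since its distribution with context differs sharply from the one without, while the second token's distribution is unchanged.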

Code Implementations: 4 repos