CL · Oct 16, 2022

Modeling Context With Linear Attention for Scalable Document-Level Translation

AI2 · MIT · UW
arXiv:2210.08431v1291 citations · h-index: 114 · Has Code
Originality: Incremental advance
AI Analysis

This work addresses the difficulty of scaling document-level translation to long documents, offering an incremental improvement over existing transformer-based methods.

The paper tackles the scalability of document-level machine translation by applying a linear attention model augmented with a sentential gate. Compared with the transformer, decoding is substantially faster on long sequences with similar or better BLEU scores on IWSLT 2015 and OpenSubtitles 2018, and sentential gating further improves translation quality on IWSLT.

Document-level machine translation leverages inter-sentence dependencies to produce more coherent and consistent translations. However, these models, predominantly based on transformers, are difficult to scale to long documents as their attention layers have quadratic complexity in the sequence length. Recent efforts on efficient attention improve scalability, but their effect on document translation remains unexplored. In this work, we investigate the efficacy of a recent linear attention model by Peng et al. (2021) on document translation and augment it with a sentential gate to promote a recency inductive bias. We evaluate the model on IWSLT 2015 and OpenSubtitles 2018 against the transformer, demonstrating substantially increased decoding speed on long sequences with similar or better BLEU scores. We show that sentential gating further improves translation quality on IWSLT.
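To make the contrast with quadratic attention concrete, here is a minimal sketch of causal linear attention with a decay applied at sentence boundaries. It is an illustration under stated assumptions, not the authors' implementation: the feature map (elu(x) + 1) stands in for the random features of Peng et al. (2021), the gate is a fixed hypothetical scalar rather than a learned quantity, and the names gated_linear_attention, sentence_starts, and gate are invented for this example. The point is that the history is summarized in a fixed-size state, so each decoding step costs the same regardless of document length, while the gate down-weights earlier sentences to favor recent context.

```python
import math


def feature_map(x):
    # Assumption: elu(x) + 1 as a positive feature map.
    # Peng et al. (2021) use random features instead.
    return [v + 1.0 if v > 0 else math.exp(v) for v in x]


def gated_linear_attention(queries, keys, values, sentence_starts, gate=0.5):
    """Causal linear attention with a decay at sentence boundaries.

    queries, keys: lists of d-dimensional vectors.
    values: list of m-dimensional vectors.
    sentence_starts: set of positions where a new sentence begins.
    gate: hypothetical fixed scalar in (0, 1]; 1.0 disables the decay.
    """
    d, m = len(keys[0]), len(values[0])
    S = [[0.0] * m for _ in range(d)]  # running sum of phi(k) v^T
    z = [0.0] * d                      # running sum of phi(k)
    outputs = []
    for t, (q, k, v) in enumerate(zip(queries, keys, values)):
        if t > 0 and t in sentence_starts:
            # Down-weight everything accumulated from earlier sentences.
            S = [[gate * s for s in row] for row in S]
            z = [gate * s for s in z]
        phi_k = feature_map(k)
        for i in range(d):
            z[i] += phi_k[i]
            for j in range(m):
                S[i][j] += phi_k[i] * v[j]
        phi_q = feature_map(q)
        denom = sum(a * b for a, b in zip(phi_q, z))
        outputs.append(
            [sum(phi_q[i] * S[i][j] for i in range(d)) / denom for j in range(m)]
        )
    return outputs


# Two tokens, each starting its own sentence; identical queries and keys.
q = [[0.0, 0.0], [0.0, 0.0]]
v = [[0.0], [1.0]]
no_gate = gated_linear_attention(q, q, v, {0, 1}, gate=1.0)
gated = gated_linear_attention(q, q, v, {0, 1}, gate=0.5)
```

With gate=1.0 the second output averages both values equally; with gate=0.5 the earlier sentence's contribution is halved, so the output moves toward the current token's value. The state S and z have sizes d × m and d, independent of how many tokens have been processed.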

Code Implementations: 1 repo
