CL, LG · Aug 17, 2024

Improving Rare Word Translation With Dictionaries and Attention Masking

arXiv:2408.09075v2 · 24 citations · h-index: 2
Originality: Incremental advance
AI Analysis

This paper addresses translation accuracy for rare words in low-resource and out-of-domain settings. It represents an incremental improvement.

The paper tackles rare word translation in machine translation by appending bilingual dictionary definitions to source sentences and using attention masking to link rare words with their definitions, improving results by up to 1.0 BLEU and 1.6 MacroF1.

In machine translation, rare words continue to be a problem for the dominant encoder-decoder architecture, especially in low-resource and out-of-domain translation settings. Human translators solve this problem with monolingual or bilingual dictionaries. In this paper, we propose appending definitions from a bilingual dictionary to source sentences and using attention masking to link together rare words with their definitions. We find that including definitions for rare words improves performance by up to 1.0 BLEU and 1.6 MacroF1.
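The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the exact masking scheme, so the rule used here is an assumption (source tokens attend to each other, and a definition is visible only to the rare word it belongs to, and vice versa). The function name `append_definitions`, the `<def>` separator token, and the toy dictionary are all hypothetical.

```python
from typing import Dict, List, Optional, Tuple


def append_definitions(
    tokens: List[str],
    dictionary: Dict[str, List[str]],
    sep: str = "<def>",  # hypothetical separator token, not from the paper
) -> Tuple[List[str], List[List[bool]]]:
    """Append dictionary definitions and build a boolean self-attention mask.

    mask[i][j] is True when position i may attend to position j.
    Every word found in `dictionary` is treated as rare (an assumption).
    """
    augmented = list(tokens)
    # owner[k] is the index of the source word that position k defines,
    # or None when position k is an ordinary source token.
    owner: List[Optional[int]] = [None] * len(tokens)
    for idx, tok in enumerate(tokens):
        if tok in dictionary:
            span = [sep] + dictionary[tok]
            augmented.extend(span)
            owner.extend([idx] * len(span))

    size = len(augmented)
    mask = [[False] * size for _ in range(size)]
    for i in range(size):
        for j in range(size):
            if owner[i] is None and owner[j] is None:
                mask[i][j] = True  # source token to source token
            elif owner[i] is None:
                mask[i][j] = owner[j] == i  # rare word to its own definition
            elif owner[j] is None:
                mask[i][j] = owner[i] == j  # definition to its rare word
            else:
                mask[i][j] = owner[i] == owner[j]  # within one definition
    return augmented, mask


# Toy usage: a German source sentence with one dictionary entry.
source = ["der", "Axolotl", "schwimmt"]
toy_dictionary = {"Axolotl": ["salamander"]}
augmented, mask = append_definitions(source, toy_dictionary)
print(augmented)
```

In this sketch the mask would be passed to the encoder's self-attention so that "Axolotl" can read its appended definition while the other source tokens cannot. A different linking rule, for example letting every source token see every definition, only requires changing the conditions in the inner loop.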

Code Implementations: 1 repo
