CL · Sep 10, 2021

Rule-based Morphological Inflection Improves Neural Terminology Translation

arXiv:2109.04620
AI Analysis

The work addresses a limitation of real-world machine translation, where constraint terms are often supplied as lemmas rather than in their inflected forms. It offers a more flexible and cost-effective solution for domain adaptation and low-resource settings.

The paper tackles the problem of incorporating lemma constraints in neural machine translation by introducing a modular framework with a cross-lingual inflection module, showing that a rule-based inflection module improves accuracy over neural and end-to-end approaches with lower training costs.

Current approaches to incorporating terminology constraints in machine translation (MT) typically assume that the constraint terms are provided in their correct morphological forms. This limits their application to real-world scenarios where constraint terms are provided as lemmas. In this paper, we introduce a modular framework for incorporating lemma constraints in neural MT (NMT) in which linguistic knowledge and diverse types of NMT models can be flexibly applied. It is based on a novel cross-lingual inflection module that inflects the target lemma constraints based on the source context. We explore linguistically motivated rule-based and data-driven neural-based inflection modules and design English-German health and English-Lithuanian news test suites to evaluate them in domain adaptation and low-resource MT settings. Results show that our rule-based inflection module helps NMT models incorporate lemma constraints more accurately than a neural module and outperforms the existing end-to-end approach with lower training costs.
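
To make the idea of a cross-lingual inflection module concrete, here is a minimal toy sketch, not the authors' implementation. It assumes grammatical number can be guessed from the English source term and that target forms come from a small hand-written table. The names `LemmaConstraint`, `predict_number`, `inflect`, and `INFLECTION_TABLE` are hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass

# Toy inflection table for two German nouns (nominative case only).
# Hypothetical data for illustration; a real module would rely on a
# full morphological lexicon or generator.
INFLECTION_TABLE = {
    ("Impfstoff", "Sing"): "Impfstoff",
    ("Impfstoff", "Plur"): "Impfstoffe",
    ("Patient", "Sing"): "Patient",
    ("Patient", "Plur"): "Patienten",
}


@dataclass
class LemmaConstraint:
    source_term: str   # term as it appears in the source sentence
    target_lemma: str  # dictionary form supplied by the terminology


def predict_number(source_term: str) -> str:
    """Crude assumption: an English term ending in 's' is plural."""
    return "Plur" if source_term.lower().endswith("s") else "Sing"


def inflect(constraint: LemmaConstraint) -> str:
    """Inflect the target lemma using a feature read off the source side."""
    number = predict_number(constraint.source_term)
    # Fall back to the bare lemma when the table has no matching rule.
    return INFLECTION_TABLE.get(
        (constraint.target_lemma, number), constraint.target_lemma
    )


if __name__ == "__main__":
    print(inflect(LemmaConstraint("vaccines", "Impfstoff")))
    print(inflect(LemmaConstraint("patient", "Patient")))
```

In a pipeline of this kind, the inflected form rather than the bare lemma would be passed to the constrained NMT model. The lookup-with-fallback design keeps the sketch usable when a lemma is missing from the table.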
