CL · LG · Sep 8, 2022

Knowledge Based Template Machine Translation In Low-Resource Setting

arXiv:2209.03554v10.3h-index: 21

Originality: Incremental advance

AI Analysis

This work addresses rare-word translation in low-resource machine translation and represents an incremental improvement.

The paper tackles named-entity translation in low-resource neural machine translation by investigating tags and hypernyms from knowledge graphs. It finds that a soft tagging mechanism improves translation consistently across high- and low-resource settings.

Incorporating tagging into neural machine translation (NMT) systems has shown promising results in helping translate rare words such as named entities (NEs). However, translating NEs in low-resource settings remains a challenge. In this work, we investigate the effect of using tags and NE hypernyms from knowledge graphs (KGs) in parallel corpora under different resource conditions. We find that the tag-and-copy mechanism (tag the NEs in the source sentence and copy them to the target sentence) improves translation in high-resource settings only. Introducing copying also has polarizing effects on the translation of different parts of speech (POS). Interestingly, we find that copy accuracy for hypernyms is consistently higher than that for entities. As a way of avoiding "hard" copying and utilizing hypernyms to bootstrap rare entities, we introduce a "soft" tagging mechanism and find consistent improvement in both high- and low-resource settings.
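
To make the tag-and-copy idea concrete, here is a minimal sketch of tagging NEs in a source sentence and measuring how many of them are copied verbatim into a translation. It is an illustration under stated assumptions, not the paper's implementation: the `<ne>` and `</ne>` marker tokens, the set-based entity lookup, the function names, and the token-match definition of copy accuracy are all hypothetical.

```python
from typing import List, Set

# Hypothetical tag tokens; the paper's actual tag format is not given here.
NE_OPEN, NE_CLOSE = "<ne>", "</ne>"


def tag_source(tokens: List[str], entities: Set[str]) -> List[str]:
    """Wrap every single-token entity found in `entities` with tag markers."""
    tagged: List[str] = []
    for tok in tokens:
        if tok in entities:
            tagged += [NE_OPEN, tok, NE_CLOSE]
        else:
            tagged.append(tok)
    return tagged


def tagged_entities(tagged: List[str]) -> List[str]:
    """Collect the tokens that sit between an opening and a closing marker."""
    found: List[str] = []
    inside = False
    for tok in tagged:
        if tok == NE_OPEN:
            inside = True
        elif tok == NE_CLOSE:
            inside = False
        elif inside:
            found.append(tok)
    return found


def copy_accuracy(tagged_source: List[str], hypothesis: List[str]) -> float:
    """Fraction of tagged source entities that appear verbatim in the output.

    This token-match definition is an assumption made for illustration.
    """
    entities = tagged_entities(tagged_source)
    if not entities:
        return 1.0  # nothing to copy, so nothing was missed
    hyp_tokens = set(hypothesis)
    return sum(e in hyp_tokens for e in entities) / len(entities)


source = "Kofi visited Accra yesterday".split()
tagged = tag_source(source, {"Kofi", "Accra"})
# A made-up system output that copies one entity and mistranslates the other.
hypothesis = "Kofi a visité Paris hier".split()
score = copy_accuracy(tagged, hypothesis)
```

In this toy case one of the two tagged entities survives in the output, so `score` is 0.5. Multi-token entities and hypernym substitution would need span-level handling that this sketch leaves out.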
