CL | Apr 9, 2022

Towards Better Chinese-centric Neural Machine Translation for Low-resource Languages

arXiv:2204.04344v1 | 31 citations | h-index: 12 | Has Code
Originality: Incremental advance
AI Analysis

This work addresses the need for Chinese-centric translation systems for low-resource languages, which is important for economic and cultural exchanges but has been understudied compared to English-centric approaches.

The paper builds neural machine translation systems for low-resource languages centered on Chinese rather than English. It reports better performance than other state-of-the-art methods by combining monolingual word-embedding data enhancement, bilingual curriculum learning, contrastive re-ranking, and a new Incomplete-Trust (In-trust) loss function.

The last decade has witnessed enormous improvements in science and technology, stimulating a growing demand for economic and cultural exchanges among countries. Building neural machine translation (NMT) systems has become an urgent need, especially in the low-resource setting. However, recent work tends to study NMT systems for low-resource languages centered on English, while few works focus on low-resource NMT systems centered on other languages such as Chinese. To address this gap, the low-resource multilingual translation challenge of the 2021 iFLYTEK AI Developer Competition provides Chinese-centric multilingual low-resource NMT tasks, in which participants are required to build NMT systems from the provided low-resource samples. In this paper, we present the winning competition system, which leverages monolingual word-embedding data enhancement, bilingual curriculum learning, and contrastive re-ranking. In addition, a new Incomplete-Trust (In-trust) loss function is proposed to replace the traditional cross-entropy loss during training. The experimental results demonstrate that implementing these ideas leads to better performance than other state-of-the-art methods. All the experimental code is released at: https://github.com/WENGSYX/Low-resource-text-translation.

Code Implementations: 1 repo
