CL Jul 14, 2021

Importance-based Neuron Allocation for Multilingual Neural Machine Translation

arXiv:2107.06569v1716 citations
Originality Incremental advance
AI Analysis

For researchers and practitioners, the paper addresses two problems with language-specific modules in multilingual translation: parameter explosion and the need for manual design. It is an incremental advance, since it builds on existing module-based approaches.

The paper argues that multilingual neural machine translation models tend to ignore language-specific knowledge. It proposes allocating the model's neurons to a general part and language-specific parts according to their importance across languages. Experiments on the IWSLT and Europarl datasets are reported to show that the method is effective and applies across several language pairs.

Multilingual neural machine translation with a single model has drawn much attention due to its capability to deal with multiple languages. However, the current multilingual translation paradigm often makes the model tend to preserve the general knowledge, but ignore the language-specific knowledge. Some previous works try to solve this problem by adding various kinds of language-specific modules to the model, but they suffer from the parameter explosion problem and require specialized manual design. To solve these problems, we propose to divide the model neurons into general and language-specific parts based on their importance across languages. The general part is responsible for preserving the general knowledge and participating in the translation of all the languages, while the language-specific part is responsible for preserving the language-specific knowledge and participating in the translation of some specific languages. Experimental results on several language pairs, covering IWSLT and Europarl corpus datasets, demonstrate the effectiveness and universality of the proposed method.
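
The allocation idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not say how importance is measured or how the split is made, so the function `allocate_neurons`, the parameters `general_ratio` and `k`, and the thresholding rule below are all hypothetical choices made for illustration. The sketch assumes a precomputed matrix of non-negative importance scores with one row per language and one column per neuron.

```python
import numpy as np


def allocate_neurons(importance, general_ratio=0.5, k=0.8):
    """Split neurons into a general part and language-specific parts.

    importance: array of shape (n_langs, n_neurons) with non-negative scores.
    general_ratio: assumed fraction of neurons shared by all languages.
    k: assumed threshold; a non-general neuron is given to a language when
       its score there is at least k times its best score over all languages.

    Returns (general_mask, lang_masks), where lang_masks[l] marks every
    neuron that takes part in translating language l.
    """
    importance = np.asarray(importance, dtype=float)
    n_langs, n_neurons = importance.shape

    # Neurons that matter most on average are treated as general (assumption).
    n_general = int(round(general_ratio * n_neurons))
    mean_importance = importance.mean(axis=0)
    order = np.argsort(-mean_importance, kind="stable")
    general_mask = np.zeros(n_neurons, dtype=bool)
    general_mask[order[:n_general]] = True

    # Remaining neurons go to the languages for which they score highly.
    threshold = k * importance.max(axis=0)
    specific = (importance >= threshold) & ~general_mask

    # Each language uses the general part plus its own specific part.
    lang_masks = general_mask | specific
    return general_mask, lang_masks


# Toy example: 2 languages, 4 neurons, made-up scores.
scores = [[0.9, 0.8, 0.1, 0.7],
          [0.9, 0.1, 0.8, 0.6]]
general_mask, lang_masks = allocate_neurons(scores)
print(general_mask)
print(lang_masks)
```

In a real model the masks would be applied to the corresponding neurons during training and inference, so that each language pair only updates and uses its own subset. How that is done in the paper is not stated in the abstract.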

Code Implementations: 1 repo
