CL, Jun 5, 2022

Multilingual Neural Machine Translation with Deep Encoder and Multiple Shallow Decoders

Meta AI
arXiv:2206.02079v1, 808 citations, h-index: 51
Originality: Incremental advance
AI Analysis

This work addresses efficiency constraints in multilingual translation applications, offering a practical improvement for deployment scenarios.

The paper tackles the high latency and memory costs of multilingual neural machine translation by proposing a deep encoder with multiple shallow decoders (DEMSD). With 2-layer decoders, DEMSD achieves a 1.8x average speedup over a standard transformer model with no drop in translation quality.

Recent work in multilingual translation advances translation quality surpassing bilingual baselines using deep transformer models with increased capacity. However, the extra latency and memory costs introduced by this approach may make it unacceptable for efficiency-constrained applications. It has recently been shown for bilingual translation that using a deep encoder and shallow decoder (DESD) can reduce inference latency while maintaining translation quality, so we study similar speed-accuracy trade-offs for multilingual translation. We find that for many-to-one translation we can indeed increase decoder speed without sacrificing quality using this approach, but for one-to-many translation, shallow decoders cause a clear quality drop. To ameliorate this drop, we propose a deep encoder with multiple shallow decoders (DEMSD) where each shallow decoder is responsible for a disjoint subset of target languages. Specifically, the DEMSD model with 2-layer decoders is able to obtain a 1.8x speedup on average compared to a standard transformer model with no drop in translation quality.
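The routing idea in the abstract (one shared deep encoder, several shallow decoders, each owning a disjoint subset of target languages) can be illustrated with a minimal sketch. Everything named here is an assumption for illustration and does not come from the paper: the `DecoderSpec` class, the `build_routing` and `layer_passes` helpers, the language groupings, and the 6/6 versus 10/2 layer counts. The cost function is a toy count of layer passes, not a latency measurement, and it is not meant to reproduce the reported 1.8x figure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DecoderSpec:
    """Hypothetical description of one shallow decoder."""
    name: str
    num_layers: int
    languages: frozenset


def build_routing(decoders):
    """Map each target language to exactly one decoder.

    Raises ValueError if two decoders claim the same language,
    which enforces the 'disjoint subset' property.
    """
    routing = {}
    for dec in decoders:
        for lang in dec.languages:
            if lang in routing:
                raise ValueError(
                    f"{lang} assigned to both {routing[lang].name} and {dec.name}"
                )
            routing[lang] = dec
    return routing


def layer_passes(encoder_layers, decoder_layers, target_len):
    """Toy cost model: the encoder runs once per sentence, while an
    autoregressive decoder runs once per generated target token."""
    return encoder_layers + decoder_layers * target_len


# Hypothetical language groupings; the abstract does not list them.
decoders = [
    DecoderSpec("dec_a", 2, frozenset({"de", "fr"})),
    DecoderSpec("dec_b", 2, frozenset({"ja", "zh"})),
]
routing = build_routing(decoders)

# Assumed layer counts: a balanced 6+6 model vs. a 10-layer encoder
# with 2-layer decoders, generating 20 target tokens.
baseline_cost = layer_passes(6, 6, 20)
demsd_cost = layer_passes(10, 2, 20)
```

Because the decoder cost is multiplied by the target length while the encoder cost is paid once, moving layers from the decoder to the encoder reduces the toy count. This is the intuition behind the speed-accuracy trade-off the abstract studies.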
