CL, Oct 6, 2020

On the Sparsity of Neural Machine Translation Models

arXiv:2010.02646v11000 citations
Originality: Incremental advance
AI Analysis

This paper addresses computational inefficiency in NMT, which matters to both researchers and practitioners, but it is an incremental advance because it builds on existing pruning and reallocation techniques.

The paper tackles over-parameterization in neural machine translation models by investigating whether redundant parameters can be reused to improve performance, and reports gains of up to +0.8 BLEU points over the baseline.

Modern neural machine translation (NMT) models employ a large number of parameters, which leads to serious over-parameterization and typically causes the underutilization of computational resources. In response to this problem, we empirically investigate whether the redundant parameters can be reused to achieve better performance. Experiments and analyses are systematically conducted on different datasets and NMT architectures. We show that: 1) the pruned parameters can be rejuvenated to improve the baseline model by up to +0.8 BLEU points; 2) the rejuvenated parameters are reallocated to enhance the ability of modeling low-level lexical information.
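The abstract does not spell out the pruning criterion or the rejuvenation procedure, so the following is only a minimal sketch under stated assumptions: magnitude-based pruning (the smallest-magnitude weights are zeroed), and "rejuvenation" read as lifting the mask so that the pruned weights restart from zero and receive gradient updates again. The function names, the toy weight matrix, the gradient, and the learning rate are all hypothetical and are not taken from the paper.

```python
import numpy as np


def magnitude_prune_mask(weights, prune_ratio):
    """Return a boolean mask that is True for the weights to keep.

    Assumption: pruning is magnitude-based, i.e. the prune_ratio fraction
    of weights with the smallest absolute value is removed.
    """
    flat = np.abs(weights).ravel()
    k = int(prune_ratio * flat.size)  # number of weights to prune
    mask = np.ones(flat.size, dtype=bool)
    if k > 0:
        # Indices of the k smallest-magnitude weights.
        prune_idx = np.argpartition(flat, k - 1)[:k]
        mask[prune_idx] = False
    return mask.reshape(weights.shape)


def masked_update(weights, grad, mask, lr):
    """SGD step for the pruned model: pruned weights stay at zero."""
    return (weights - lr * grad) * mask


def rejuvenated_update(weights, grad, lr):
    """SGD step with the mask lifted: pruned weights train again from zero."""
    return weights - lr * grad


# Toy example with hypothetical values.
weights = np.array([[0.5, -0.01, 0.3],
                    [0.02, -0.7, 0.05]])
mask = magnitude_prune_mask(weights, prune_ratio=0.5)
pruned = weights * mask          # small-magnitude weights set to zero

grad = np.ones_like(weights)     # stand-in gradient
still_pruned = masked_update(pruned, grad, mask, lr=0.1)
rejuvenated = rejuvenated_update(pruned, grad, lr=0.1)
```

In this sketch the only difference between the two update rules is whether the mask is applied after the step; with the mask lifted, the previously pruned positions move away from zero and can be reallocated to whatever the training signal favors.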

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.