Accelerating NMT Batched Beam Decoding with LMBR Posteriors for Deployment
This work addresses deployment efficiency for NMT systems, but the contribution appears incremental because it builds on existing LMBR and Transformer methods.
The paper tackles the problem of accelerating batched beam decoding for neural machine translation (NMT) by integrating LMBR n-gram posteriors. It shows that LMBR techniques still yield gains on top of recent Transformer results, although the abstract does not report concrete numbers.
We describe a batched beam decoding algorithm for NMT with LMBR n-gram posteriors, showing that LMBR techniques still yield gains on top of the best recently reported results with Transformers. We also discuss acceleration strategies for deployment, and the effect of the beam size and batching on memory and speed.
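To make the idea of beam decoding with n-gram posteriors concrete, here is a minimal sketch of one beam expansion step. It is not the paper's algorithm: the scoring form (NMT log-probability plus a per-word weight and weighted n-gram posteriors) is an assumption for illustration, and the names `lmbr_bonus`, `beam_step`, `posteriors`, and `thetas` are hypothetical. The sketch is unbatched and uses plain Python for clarity; a deployed decoder would presumably vectorize the expansion across sentences and hypotheses.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

Tokens = Tuple[str, ...]
Hyp = Tuple[Tokens, float]  # (tokens so far, accumulated score)


def lmbr_bonus(prefix: Tokens, token: str,
               posteriors: Dict[Tokens, float],
               thetas: Sequence[float]) -> float:
    """Assumed bonus: thetas[0] per word, plus thetas[n] * posterior
    for every n-gram (n = 1..len(thetas)-1) that ends at `token`."""
    extended = prefix + (token,)
    bonus = thetas[0]
    for n in range(1, len(thetas)):
        if len(extended) >= n:  # only score complete n-grams
            bonus += thetas[n] * posteriors.get(extended[-n:], 0.0)
    return bonus


def beam_step(beam: List[Hyp],
              next_log_probs: Callable[[Tokens], Dict[str, float]],
              posteriors: Dict[Tokens, float],
              thetas: Sequence[float],
              beam_size: int) -> List[Hyp]:
    """Expand every hypothesis by one token and keep the best `beam_size`."""
    candidates: List[Hyp] = []
    for tokens, score in beam:
        # next_log_probs stands in for the NMT model (hypothetical interface).
        for token, logp in next_log_probs(tokens).items():
            new_score = score + logp + lmbr_bonus(tokens, token, posteriors, thetas)
            candidates.append((tokens + (token,), new_score))
    candidates.sort(key=lambda h: h[1], reverse=True)
    return candidates[:beam_size]
```

In this sketch the cost of a step grows with the beam size times the vocabulary size, which is one reason beam size and batching affect memory and speed.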