LG AS SP | Feb 10, 2020

Accelerating RNN Transducer Inference via One-Step Constrained Beam Search

arXiv:2002.03577v1 | 28 citations

AI Analysis

This work addresses inference efficiency for speech recognition systems, but it is incremental as it builds on existing RNN-T methods.

The authors tackled the slow inference speed of RNN transducer beam search by proposing a one-step constrained beam search that eliminates a while-loop through vectorization and pruning, achieving significant speedup with lower phoneme and word error rates.

We propose a one-step constrained (OSC) beam search to accelerate recurrent neural network (RNN) transducer (RNN-T) inference. The original RNN-T beam search contains a while-loop that slows down decoding. The OSC beam search eliminates this while-loop by vectorizing multiple hypotheses. This vectorization is nontrivial because the number of expansions can differ from hypothesis to hypothesis in the original RNN-T beam search. However, we found that in most cases each hypothesis is expanded only once at each decoding step; thus, we constrain the maximum number of expansions to one, which allows the hypotheses to be vectorized. For further acceleration, we assign constraints to the prefixes of the hypotheses to prune the redundant search space. In addition, OSC beam search performs a duplication check among hypotheses during decoding, as duplicates can undesirably shrink the search space. We achieved a significant speedup compared with other RNN-T beam search methods, with lower phoneme and word error rates.
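
To make the one-expansion-per-step idea concrete, here is a minimal Python sketch. It is not the authors' implementation. It assumes a hypothetical `toy_joint` function in place of a trained RNN-T joint network, uses token 0 as the blank symbol, and handles duplicate prefixes by summing their probabilities, which is one plausible reading of the duplication check. The prefix-constraint pruning mentioned in the abstract is left out. The names `toy_joint`, `osc_beam_search`, and `BLANK` are illustrative only.

```python
import numpy as np

BLANK = 0  # assumption: token 0 plays the role of the blank symbol


def toy_joint(frame, last_tokens, vocab_size):
    """Hypothetical stand-in for an RNN-T joint network.

    Returns log-probabilities of shape (beam, vocab_size), computed for
    every hypothesis in the beam with a single batched call.
    """
    logits = np.cos(frame + np.outer(last_tokens + 1, np.arange(vocab_size)))
    return logits - np.logaddexp.reduce(logits, axis=1, keepdims=True)


def osc_beam_search(frames, vocab_size, beam_size, joint=toy_joint):
    """Beam search in which each hypothesis is expanded at most once per frame."""
    hyps = [()]            # token prefixes, stored without blanks
    scores = np.zeros(1)   # log-probability of each prefix
    for frame in frames:
        last = np.array([h[-1] if h else BLANK for h in hyps])
        # One batched call scores all hypotheses; there is no inner while-loop
        # because each hypothesis emits either blank or exactly one token.
        total = scores[:, None] + joint(frame, last, vocab_size)
        candidates = {}
        for b, prefix in enumerate(hyps):
            for v in range(vocab_size):
                new = prefix if v == BLANK else prefix + (v,)
                s = total[b, v]
                # Duplication check: identical prefixes are merged so they
                # do not occupy more than one slot in the beam.
                if new in candidates:
                    candidates[new] = np.logaddexp(candidates[new], s)
                else:
                    candidates[new] = s
        best = sorted(candidates.items(), key=lambda kv: -kv[1])[:beam_size]
        hyps = [p for p, _ in best]
        scores = np.array([s for _, s in best])
    return list(zip(hyps, scores))


if __name__ == "__main__":
    for prefix, score in osc_beam_search(np.linspace(0.0, 1.0, 5), 4, 3):
        print(prefix, float(score))
```

Because every hypothesis advances by exactly one frame per iteration, the beam stays aligned in time. That alignment is what allows the joint computation to be batched across hypotheses in this sketch.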
