PHRASED: Phrase Dictionary Biasing for Speech Translation
This paper addresses a specific problem in speech translation, the handling of rare phrases, and offers incremental improvements over existing biasing techniques.
The paper tackles the challenge of correctly translating rare phrases in speech translation by proposing a phrase dictionary biasing method, which outperforms phrase list biasing by a relative 21% for a streaming speech translation model and yields an 85% relative improvement in phrase recall for multimodal large language models.
Phrases are essential to understanding the core concepts in conversations. However, because they occur rarely in training data, translating them correctly is challenging in speech translation tasks. In this paper, we propose a phrase dictionary biasing method that leverages pairs of phrases mapped from the source language to the target language. We apply the method to two widely adopted types of models: a transducer-based streaming speech translation model and a multimodal large language model. Experimental results show that phrase dictionary biasing outperforms phrase list biasing by a relative 21% for the streaming speech translation model. In addition, phrase dictionary biasing enables multimodal large language models to use external phrase information, achieving an 85% relative improvement in phrase recall.
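To make the reported quantities concrete, the following is a minimal sketch of how a phrase dictionary might be represented, and of how phrase recall and a relative improvement could be computed. It is not the paper's implementation: the abstract does not define phrase recall precisely, so the case-insensitive substring matching used here is an assumption, and the names `PhraseDict`, `phrase_recall`, and `relative_improvement`, along with the example sentences, are hypothetical.

```python
from typing import Dict, List

# Hypothetical phrase dictionary: source-language phrase -> target-language phrase.
# A phrase list, by contrast, would carry phrases for one language only.
PhraseDict = Dict[str, str]


def phrase_recall(hypotheses: List[str], references: List[str],
                  phrase_dict: PhraseDict) -> float:
    """Fraction of dictionary target phrases found in the references
    that also appear in the corresponding hypotheses.

    Assumption: matching is a case-insensitive substring test, which is a
    simplification and may differ from the metric used in the paper.
    """
    expected = 0
    found = 0
    for hyp, ref in zip(hypotheses, references):
        hyp_l, ref_l = hyp.lower(), ref.lower()
        for target in phrase_dict.values():
            t = target.lower()
            if t in ref_l:          # the phrase should have been produced
                expected += 1
                if t in hyp_l:      # the system actually produced it
                    found += 1
    return found / expected if expected else 0.0


def relative_improvement(baseline: float, improved: float) -> float:
    """Relative gain of `improved` over `baseline`, e.g. 0.5 means +50%."""
    return (improved - baseline) / baseline


# Illustrative data only; none of these values come from the paper.
phrase_dict: PhraseDict = {"Bundestag": "German parliament", "Rathaus": "town hall"}
references = ["The German parliament met today", "We met at the town hall"]
hypotheses = ["The German parliament met today", "We met at the city council"]
recall = phrase_recall(hypotheses, references, phrase_dict)
```

Here one of the two expected target phrases is recovered, so `recall` is 0.5. A "relative" improvement in this sense compares the gain to the baseline value rather than reporting an absolute difference, which is how the 21% and 85% figures in the abstract should be read.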