AS CL SD

Mar 25, 2024

Advancing Speech Translation: A Corpus of Mandarin-English Conversational Telephone Speech

Shannon Wotherspoon, William Hartmann, Matthew Snover

arXiv:2404.11619v13.32 citations

h-index: 17

Originality: Incremental advance

AI Analysis

The corpus provides a domain-specific resource for speech translation, addressing a data bottleneck for researchers and practitioners working on Mandarin-English conversational telephone speech.

The paper tackles the lack of paired speech-text data for Mandarin-English conversational telephone speech by introducing English translations for a 123-hour subset of the CallHome Mandarin Chinese and HKUST Mandarin Telephone Speech data. Fine-tuning a general-purpose translation model on this training set improves target-domain BLEU by more than 8 points.

This paper introduces a set of English translations for a 123-hour subset of the CallHome Mandarin Chinese data and the HKUST Mandarin Telephone Speech data for the task of speech translation. Paired source-language speech and target-language text is essential for training end-to-end speech translation systems and can provide substantial performance improvements for cascaded systems as well, relative to training on more widely available text data sets. We demonstrate that fine-tuning a general-purpose translation model to our Mandarin-English conversational telephone speech training set improves target-domain BLEU by more than 8 points, highlighting the importance of matched training data.
