CL SD AS Jun 25, 2024

Leveraging Synthetic Audio Data for End-to-End Low-Resource Speech Translation

arXiv:2406.17363v215.932 citations

Originality Synthesis-oriented

AI Analysis

The paper addresses speech translation for low-resource languages such as Irish, but its contribution is incremental: it applies existing methods to a new dataset.

The paper tackles low-resource Irish-to-English speech translation by building end-to-end systems based on Whisper and applying data augmentation techniques such as speech back-translation and noise augmentation. The resulting system was submitted to IWSLT 2024.

This paper describes our system submission to the International Conference on Spoken Language Translation (IWSLT 2024) for Irish-to-English speech translation. We built end-to-end systems based on Whisper, and employed a number of data augmentation techniques, such as speech back-translation and noise augmentation. We investigate the effect of using synthetic audio data and discuss several methods for enriching signal diversity.
