CL · SD · AS · May 12, 2023

Improving Cascaded Unsupervised Speech Translation with Denoising Back-translation

arXiv:2305.07455v1 · 6 citations
Originality: Incremental advance
AI Analysis

The paper addresses data scarcity in speech translation for low-resource languages. The advance is incremental because it builds on existing unsupervised techniques.

The paper tackles speech translation for low-resource languages without parallel data by proposing a cascaded unsupervised system with denoising back-translation (DBT). DBT improves the BLEU score by 0.7 to 0.9 in all three translation directions, and the system's results are comparable to some early supervised methods in some language pairs.

Most speech translation models rely heavily on parallel data, which is hard to collect, especially for low-resource languages. To tackle this issue, we propose to build a cascaded speech translation system without leveraging any kind of paired data. We use fully unpaired data to train our unsupervised systems and evaluate our results on CoVoST 2 and CVSS. The results show that our work is comparable with some early supervised methods in some language pairs. Because cascaded systems suffer from severe error propagation, we propose denoising back-translation (DBT), a novel approach to building robust unsupervised neural machine translation (UNMT). DBT increases the BLEU score by 0.7 to 0.9 in all three translation directions. Moreover, we simplify the pipeline of our cascaded system to reduce inference latency and conduct a comprehensive analysis of every part of our work. We also demonstrate our unsupervised speech translation results on the established website.
