CLAS Oct 1, 2019

Multilingual End-to-End Speech Translation

arXiv:1910.00254v2, 100 citations
AI Analysis

This work addresses the challenge of speech translation for multiple languages, offering a novel approach that could improve efficiency and performance in multilingual communication systems.

The paper tackles multilingual end-to-end speech translation by proposing a universal sequence-to-sequence framework that translates speech in the source languages directly into the target languages. It reports that multilingual models significantly outperform bilingual ones in both one-to-many and many-to-many scenarios.

In this paper, we propose a simple yet effective framework for multilingual end-to-end speech translation (ST), in which speech utterances in source languages are directly translated to the desired target languages with a universal sequence-to-sequence architecture. While multilingual models have been shown to be useful for automatic speech recognition (ASR) and machine translation (MT), this is the first time they have been applied to the end-to-end ST problem. We show the effectiveness of multilingual end-to-end ST in two scenarios: one-to-many and many-to-many translation with publicly available data. We experimentally confirm that multilingual end-to-end ST models significantly outperform bilingual ones in both scenarios. The generalization of multilingual training is also evaluated in a transfer learning scenario to a very low-resource language pair. All of our code and the database are publicly available to encourage further research on this emerging multilingual ST topic.
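The abstract does not say how a single model is told which target language to produce. A minimal sketch of one common approach, assumed here for illustration only, is to pool the training data from all language pairs and prepend a target-language tag to each target token sequence. The tag format `<2xx>`, the function names, and the toy data below are hypothetical and are not taken from the paper or its released code.

```python
from typing import Dict, List, Tuple

# (source language, target language), e.g. ("en", "de")
LangPair = Tuple[str, str]
# (utterance id standing in for the speech features, target tokens)
Example = Tuple[str, List[str]]


def lang_tag(lang: str) -> str:
    """Hypothetical tag format marking the desired target language."""
    return f"<2{lang}>"


def build_multilingual_corpus(corpora: Dict[LangPair, List[Example]]) -> List[Example]:
    """Pool examples from all language pairs into one training set.

    Each target sequence gets the target-language tag as its first token,
    so one shared decoder can be steered toward a language at inference.
    """
    pooled: List[Example] = []
    for (_src, tgt), examples in corpora.items():
        for utt_id, tokens in examples:
            pooled.append((utt_id, [lang_tag(tgt)] + tokens))
    return pooled


# Toy one-to-many setup: the same English utterance with two target languages.
corpora = {
    ("en", "de"): [("utt1", ["guten", "morgen"])],
    ("en", "fr"): [("utt1", ["bonjour"])],
}
pooled = build_multilingual_corpus(corpora)
```

In this sketch the speech input is untouched; only the target side carries the language information, which is why the same utterance can appear once per target language in the pooled set.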

Code Implementations: 1 repo