CL, LG, SD, AS
Oct 2, 2019

Speech-to-speech Translation between Untranscribed Unknown Languages

arXiv:1910.00795v2
55 citations
Originality: Highly original
AI Analysis

The paper addresses translation between low-resource or unwritten languages, and it takes a novel approach rather than making an incremental improvement.

It tackles speech-to-speech translation between languages that have no transcriptions or linguistic supervision. Unsupervised term discovery produces discrete units, and a sequence-to-sequence model maps source speech directly to those units, so no auxiliary steps are needed.

In this paper, we explore a method for training speech-to-speech translation models without any transcription or linguistic supervision. Our proposed method consists of two steps. First, we train a discrete quantized autoencoder for unsupervised term discovery and use it to generate discrete representations. Second, we train a sequence-to-sequence model that directly maps the source language speech to the target language's discrete representation. Our proposed method can directly generate target speech without any auxiliary or pre-training steps that rely on a source or target transcription. To the best of our knowledge, this is the first work to perform pure speech-to-speech translation between untranscribed unknown languages.
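The first step can be illustrated with a small sketch. The abstract says only that a "discrete quantized autoencoder" produces the discrete representation. The code below assumes the common nearest-codebook-entry lookup used in vector-quantized autoencoders, which may differ from the paper's exact model. The function name `quantize`, the toy codebook, and the frame values are all hypothetical and chosen for illustration.

```python
# Minimal sketch (assumption): turn continuous encoder frames into discrete
# unit indices by nearest-neighbour lookup in a codebook. In a real system the
# codebook would be learned; here it is fixed by hand.

def quantize(frames, codebook):
    """Return, for each frame, the index of the closest codebook vector."""
    units = []
    for frame in frames:
        # Squared Euclidean distance from this frame to every codebook entry.
        dists = [
            sum((f - c) ** 2 for f, c in zip(frame, code))
            for code in codebook
        ]
        # The discrete unit is the position of the smallest distance.
        units.append(dists.index(min(dists)))
    return units


# Hypothetical 2-dimensional codebook with three entries.
codebook = [[0.0, 0.0], [1.0, 1.0], [4.0, 0.0]]

# Hypothetical encoder outputs for four speech frames.
frames = [[0.1, -0.2], [0.9, 1.2], [3.5, 0.3], [1.1, 0.8]]

units = quantize(frames, codebook)
print(units)
```

In the second step described above, a sequence-to-sequence model would be trained to predict unit sequences like this one for the target language from source speech features.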

