CL · SD · AS · Nov 18, 2022

Dialogs Re-enacted Across Languages

arXiv:2211.11584v2 · 2 citations · h-index: 23
Originality: Synthesis-oriented
AI Analysis

This work addresses the need for bilingual dialog data among researchers in speech-to-speech translation. Its contribution is incremental, since it focuses on data collection rather than novel methods.

The authors collected closely matched bilingual dialog data to support cross-language prosodic mapping and improvements in speech-to-speech translation, yielding a publicly released corpus and a protocol for data collection.

To support machine learning of cross-language prosodic mappings and other ways to improve speech-to-speech translation, we present a protocol for collecting closely matched pairs of utterances across languages, a description of the resulting data collection and its public release, and some observations and musings. This report is intended for: people using this corpus, people extending this corpus, and people designing similar collections of bilingual dialog data.

Code Implementations: 1 repo
