CL | SD | AS | Jun 5, 2023

Cross-Lingual Transfer Learning for Phrase Break Prediction with Multilingual Language Model

arXiv:2306.02579v1 | 3 citations | h-index: 9
Originality: Incremental advance
AI Analysis

The paper addresses the shortage of labeled data for text-to-speech systems in low-resource languages. The advance is incremental, since it applies existing transfer methods to a specific task.

The paper tackles phrase break prediction for low-resource languages using cross-lingual transfer learning with a multilingual language model, and shows that the approach improves performance, particularly in few-shot settings.

Phrase break prediction is a crucial task for improving the prosody naturalness of a text-to-speech (TTS) system. However, most proposed phrase break prediction models are monolingual, trained exclusively on a large amount of labeled data. In this paper, we address this issue for low-resource languages with limited labeled data using cross-lingual transfer. We investigate the effectiveness of zero-shot and few-shot cross-lingual transfer for phrase break prediction using a pre-trained multilingual language model. We use manually collected datasets in four Indo-European languages: one high-resource language and three with limited resources. Our findings demonstrate that cross-lingual transfer learning can be a particularly effective approach, especially in the few-shot setting, for improving performance in low-resource languages. This suggests that cross-lingual transfer can be inexpensive and effective for developing TTS front-end in resource-poor languages.
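As a rough illustration of the task setup, phrase break prediction is commonly framed as word-level tagging: each word gets a label saying whether a break follows it. The sketch below assumes that framing and is not taken from the paper. The `|` break marker, the function names, and the idea of building a few-shot set by appending `k` target-language examples to the source-language data are all hypothetical choices made for this example.

```python
from typing import List, Sequence, Tuple

BREAK = "|"  # hypothetical break marker, used only in this sketch


def to_labels(annotated: str) -> Tuple[List[str], List[int]]:
    """Split a break-annotated sentence into words and per-word labels.

    Label 1 means a phrase break follows the word, 0 means no break.
    """
    words: List[str] = []
    labels: List[int] = []
    for piece in annotated.split():
        if piece == BREAK:
            # Mark the previous word as break-final (ignore a leading marker).
            if labels:
                labels[-1] = 1
        else:
            words.append(piece)
            labels.append(0)
    return words, labels


def build_training_set(source: Sequence, target: Sequence, k: int) -> list:
    """Assumed setup: k = 0 is zero-shot (source language only);
    k > 0 adds the first k target-language examples (few-shot)."""
    return list(source) + list(target[:k])
```

For example, `to_labels("after the meeting | we went home |")` yields six words, with break labels on "meeting" and "home". A multilingual token classifier would then be trained on such word and label pairs.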
