CL | AI | SD | AS | Dec 26, 2024

Indonesian-English Code-Switching Speech Synthesizer Utilizing Multilingual STEN-TTS and Bert LID

arXiv:2412.19043v1 | h-index: 36 | O-COCOSDA
Originality: Synthesis-oriented
AI Analysis

This work addresses a specific need of Indonesian speakers, who frequently code-switch between Indonesian and English. It represents an incremental improvement in domain-specific TTS.

The study tackled the lack of a multilingual text-to-speech system for Indonesian-English code-switching by modifying STEN-TTS with a BERT-based language identification component, resulting in superior naturalness and improved speech intelligibility compared to baseline models.

Multilingual text-to-speech (TTS) systems convert text into speech across multiple languages. In many cases, a sentence may contain segments in different languages, a phenomenon known as code-switching. This is particularly common in Indonesia, especially between Indonesian and English. Despite its significance, no research has yet developed a multilingual TTS system capable of handling code-switching between these two languages. This study addresses Indonesian-English code-switching in STEN-TTS. Key modifications include adding a language identification component to the text-to-phoneme conversion, using a fine-tuned BERT model for per-word language identification, and removing the language embedding from the base model. Experimental results demonstrate that the code-switching model achieves superior naturalness and improved speech intelligibility compared to the Indonesian and English baseline STEN-TTS models.
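
The per-word language identification step described in the abstract can be pictured roughly as follows. This is a minimal sketch, not the authors' implementation: `toy_lid` is a hypothetical word-list stand-in for the fine-tuned BERT classifier, `toy_g2p` is a placeholder for real language-specific grapheme-to-phoneme converters, and the names `text_to_phonemes` and `ENGLISH_HINTS` are invented for illustration. It only shows the routing idea: label each word as Indonesian or English, then send it to the matching phoneme converter.

```python
from typing import Callable, Dict, List, Tuple

# ASSUMPTION: a tiny word list standing in for the fine-tuned BERT
# per-word classifier mentioned in the abstract.
ENGLISH_HINTS = {"meeting", "deadline", "online", "update"}


def toy_lid(words: List[str]) -> List[str]:
    """Return one language tag ("id" or "en") per input word."""
    return ["en" if w.lower() in ENGLISH_HINTS else "id" for w in words]


def toy_g2p(lang: str) -> Callable[[str], List[str]]:
    """Build a placeholder converter that tags each character with a language.

    ASSUMPTION: a real system would use a proper phonemizer per language.
    """
    def convert(word: str) -> List[str]:
        return [f"{lang}:{ch}" for ch in word.lower()]
    return convert


def text_to_phonemes(
    text: str,
    lid: Callable[[List[str]], List[str]],
    g2p_by_lang: Dict[str, Callable[[str], List[str]]],
) -> List[Tuple[str, str, List[str]]]:
    """Split text into words, tag each word's language, and route it
    to the converter registered for that language."""
    words = [w.strip(".,!?") for w in text.split()]
    words = [w for w in words if w]  # drop tokens that were only punctuation
    langs = lid(words)
    return [(w, lang, g2p_by_lang[lang](w)) for w, lang in zip(words, langs)]


G2P = {"id": toy_g2p("id"), "en": toy_g2p("en")}
tagged = text_to_phonemes("Saya ada meeting besok.", toy_lid, G2P)
for word, lang, phonemes in tagged:
    print(word, lang, phonemes)
```

Because the language decision is made per word inside the text-to-phoneme stage, the downstream acoustic model in this sketch would receive only a phoneme sequence, which is one way to read the abstract's removal of the language embedding from the base model.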

