CL · AI · Mar 8

Learning-free L2-Accented Speech Generation using Phonological Rules

arXiv:2603.07550v1
Predicted impact: top 72% in CL · last 90 days
Originality: Incremental advance
AI Analysis

This work addresses the problem of generating accented speech for speech technology developers who lack large accented datasets or require fine-grained phoneme-level control.

This paper proposes a learning-free framework for generating L2-accented speech by applying phonological rules to phoneme sequences, combined with a multilingual TTS model. The method successfully shifts accent for Spanish- and Indian-accented English without requiring accented training data, while maintaining speech quality.

Accent plays a crucial role in speaker identity and inclusivity in speech technologies. Existing accented text-to-speech (TTS) systems either require large-scale accented datasets or lack fine-grained phoneme-level controllability. We propose an accented TTS framework that combines phonological rules with a multilingual TTS model. The rules are applied to phoneme sequences to transform accent at the phoneme level while preserving intelligibility. The method requires no accented training data and enables explicit phoneme-level accent manipulation. We design rule sets for Spanish- and Indian-accented English, modeling systematic differences in consonants, vowels, and syllable structure arising from phonotactic constraints. We analyze the trade-off between phoneme-level duration alignment and accent as realized in speech timing. Experimental results demonstrate effective accent shift while maintaining speech quality.
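
To make the idea of applying phonological rules to a phoneme sequence concrete, here is a minimal Python sketch. It is not the paper's rule set or implementation: the three substitutions and the epenthesis rule are commonly cited textbook patterns of Spanish-influenced English, chosen here only for illustration, and the names `SUBSTITUTIONS`, `apply_epenthesis`, and `accent_word` are hypothetical. In a pipeline like the one described, the transformed phoneme list would presumably be passed to a multilingual TTS model that accepts phoneme input.

```python
# Illustrative sketch only. The rules below are textbook examples of
# Spanish-influenced English, NOT the rule set used in the paper.

# Simplified vowel inventory used to detect consonant clusters (assumption).
VOWELS = {"a", "e", "i", "o", "u", "ɪ", "ɛ", "æ", "ʌ", "ə", "ʊ", "ɔ", "ɑ"}

# Context-free segment substitutions (hypothetical examples).
SUBSTITUTIONS = {
    "v": "b",  # /v/ realized as /b/
    "z": "s",  # /z/ devoiced to /s/
    "ɪ": "i",  # lax /ɪ/ raised to tense /i/
}


def apply_substitutions(phonemes):
    """Replace each phoneme that has a substitution rule; keep the rest."""
    return [SUBSTITUTIONS.get(p, p) for p in phonemes]


def apply_epenthesis(phonemes):
    """Insert /e/ before a word-initial /s/ + consonant cluster.

    This models a phonotactic constraint: the cluster is not allowed
    word-initially, so a vowel is inserted to repair the syllable.
    """
    if len(phonemes) >= 2 and phonemes[0] == "s" and phonemes[1] not in VOWELS:
        return ["e"] + list(phonemes)
    return list(phonemes)


def accent_word(phonemes):
    """Apply the syllable-structure rule first, then segment substitutions."""
    return apply_substitutions(apply_epenthesis(phonemes))


if __name__ == "__main__":
    print(accent_word(["s", "k", "u", "l"]))       # "school"
    print(accent_word(["v", "ɪ", "z", "ɪ", "t"]))  # "visit"
```

Because each rule is an explicit function over the phoneme list, individual rules can be switched on or off, which is the kind of phoneme-level control the abstract refers to.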

