CL AI LG SD AS
Feb 4, 2025

Developing multilingual speech synthesis system for Ojibwe, Mi'kmaq, and Maliseet

Shenran Wang, Changbing Yang, Mike Parkhill, Chad Quinn, Christopher Hammerly, Jian Zhu

arXiv:2502.02703v1 17.011 citations h-index: 2 Has Code NAACL

Originality: Incremental advance

AI Analysis

This work supports the revitalization of low-resource Indigenous languages by providing a technical solution for speech synthesis. The advance is incremental, since it builds on existing multilingual and attention-free methods.

The researchers tackle speech synthesis for low-resource Indigenous languages by developing a multilingual TTS system for Ojibwe, Mi'kmaq, and Maliseet. They find that multilingual training improves performance over monolingual models, especially when data are scarce, and that attention-free architectures are competitive with self-attention while being more memory-efficient.

We present lightweight flow-matching multilingual text-to-speech (TTS) systems for Ojibwe, Mi'kmaq, and Maliseet, three Indigenous languages in North America. Our results show that training a multilingual TTS model on three typologically similar languages can improve performance over monolingual models, especially when data are scarce. Attention-free architectures are highly competitive with self-attention architectures while offering higher memory efficiency. Our research not only advances technical development for the revitalization of low-resource languages but also highlights the cultural gap in human evaluation protocols, calling for a more community-centered approach to human evaluation.
