AS CL SD · Jan 12, 2025

Improving Cross-Lingual Phonetic Representation of Low-Resource Languages Through Language Similarity Analysis

arXiv:2501.06810v1 · 5 citations · h-index: 5 · ICASSP

Originality: Incremental advance

AI Analysis

The paper addresses speech processing challenges for low-resource languages by optimizing source language selection, though the advance appears incremental because it builds on existing cross-lingual methods.

The paper aims to improve cross-lingual phonetic representation for low-resource languages by using language similarity analysis to select source languages, and reports a 55.6% relative improvement in phoneme recognition over monolingual training.

This paper examines how linguistic similarity affects cross-lingual phonetic representation in speech processing for low-resource languages, emphasizing effective source language selection. Previous cross-lingual research has used various source languages to enhance performance for the target low-resource language without thorough consideration of selection. Our study stands out by providing an in-depth analysis of language selection, supported by a practical approach to assess phonetic proximity among multiple language families. We investigate how within-family similarity impacts performance in multilingual training, which aids in understanding language dynamics. We also evaluate the effect of using phonologically similar languages, regardless of family. For the phoneme recognition task, utilizing phonologically similar languages consistently achieves a relative improvement of 55.6% over monolingual training, even surpassing the performance of a large-scale self-supervised learning model. Multilingual training within the same language family demonstrates that higher phonological similarity enhances performance, while lower similarity results in degraded performance compared to monolingual training.
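To make the "relative improvement" figure concrete, here is a minimal sketch of how such a number is commonly computed from two error rates. It assumes the metric is an error rate where lower is better (for example a phoneme error rate), which the abstract does not state. The function name and the input values are hypothetical and are not taken from the paper; the inputs were chosen only so that the result lands on the same order as the reported 55.6%.

```python
def relative_improvement(baseline_error: float, new_error: float) -> float:
    """Relative reduction in an error rate, as a percentage of the baseline.

    Positive values mean the new system is better (lower error);
    negative values mean it degraded relative to the baseline.
    """
    if baseline_error <= 0:
        raise ValueError("baseline_error must be positive")
    return 100.0 * (baseline_error - new_error) / baseline_error


# HYPOTHETICAL error rates for illustration only; NOT results from the paper.
hypothetical_monolingual_error = 45.0   # assumed baseline error rate (%)
hypothetical_multilingual_error = 20.0  # assumed error rate with similar source languages (%)

improvement = relative_improvement(
    hypothetical_monolingual_error, hypothetical_multilingual_error
)
print(f"{improvement:.1f}% relative improvement")  # prints "55.6% relative improvement"
```

Under the same convention, a multilingual system that ends up with a higher error than the monolingual baseline yields a negative value, which corresponds to the degradation the abstract describes for low-similarity languages within a family.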
