CL, Nov 24, 2022

German Phoneme Recognition with Text-to-Phoneme Data Augmentation

arXiv:2211.13776v1
h-index: 4
Originality: Synthesis-oriented
AI Analysis

This work addresses German phoneme recognition, but its contribution is incremental: it explores one specific data augmentation tweak, with mixed results.

The study investigated the impact of adding frequent phoneme bigrams via text-to-phoneme data augmentation on German phoneme recognition, finding that some models increased BLEU scores by over 1 point while one decreased by more than 20 points, indicating variable effects on performance.

In this study, we examine the effect of adding the n most frequent phoneme bigrams to the basic vocabulary of a German phoneme recognition model trained with a text-to-phoneme data augmentation strategy. Compared to the baseline model, the vowel30 and const20 models improved the BLEU score by more than 1 point, whereas the total30 model's BLEU score dropped by more than 20 points, showing that phoneme bigrams can have either a positive or a negative effect on model performance. In addition, through error analysis we identified the types of errors that the models made repeatedly.
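As a rough illustration of what "adding the n most frequent phoneme bigrams to the basic vocabulary" could look like, here is a minimal Python sketch. It is not the authors' implementation: the function names `top_phoneme_bigrams` and `extend_vocabulary`, the toy corpus, and the choice to represent a bigram as the concatenation of its two phoneme symbols are all assumptions made for this example.

```python
from collections import Counter


def top_phoneme_bigrams(sequences, n):
    """Return the n most frequent adjacent phoneme pairs across all sequences."""
    counts = Counter()
    for seq in sequences:
        # zip(seq, seq[1:]) yields each pair of neighbouring phonemes.
        counts.update(zip(seq, seq[1:]))
    return [pair for pair, _ in counts.most_common(n)]


def extend_vocabulary(base_vocab, bigrams):
    """Append each bigram as one merged token, skipping tokens already present."""
    vocab = list(base_vocab)
    for first, second in bigrams:
        token = first + second  # assumption: a bigram token is the two symbols joined
        if token not in vocab:
            vocab.append(token)
    return vocab


# Toy phoneme sequences for illustration only (not data from the paper).
corpus = [
    ["d", "a", "s"],
    ["d", "a", "n"],
    ["m", "a", "n"],
]

base_vocab = sorted({phoneme for seq in corpus for phoneme in seq})
bigrams = top_phoneme_bigrams(corpus, n=2)
vocab = extend_vocabulary(base_vocab, bigrams)
print(vocab)
```

In this toy corpus the pairs ("d", "a") and ("a", "n") each occur twice, so they are the two bigrams appended to the five single-phoneme entries. Varying `n`, or restricting the counted pairs to a subset of phonemes, would give differently sized vocabularies to compare against a baseline.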

