CL AI HC | Mar 5, 2021

Transfer Learning based Speech Affect Recognition in Urdu

arXiv:2103.03580v1

Originality: Synthesis-oriented

AI Analysis

This work addresses data scarcity in speech affect recognition for low-resource languages such as Urdu, though its contribution is incremental: it applies existing transfer learning methods to a new language.

The authors tackled speech affect recognition for low-resource Urdu by using transfer learning from high-resource languages, achieving 74.7% UAR on RAVDESS-to-Urdu transfer and demonstrating effectiveness with only 400 Urdu utterances.

It has been established that Speech Affect Recognition for low-resource languages is a difficult task. Here we present a transfer-learning-based Speech Affect Recognition approach in which we pre-train a Deep Residual Network on an affect recognition task in a high-resource language and then fine-tune its parameters for a low-resource language. We use four standard data sets to demonstrate that transfer learning can address the problem of data scarcity in the affect recognition task. We demonstrate that our approach is effective by achieving 74.7 percent UAR with RAVDESS as the source and the Urdu data set as the target. Through an ablation study, we find that the pre-trained model contributes most of the feature information, accounts for most of the improvement in results, and mitigates the shortage of data. Using this knowledge, we have also experimented on the SAVEE and EMO-DB data sets with Urdu as the target language, for which only 400 utterances are available. This approach achieves a high Unweighted Average Recall (UAR) compared with existing algorithms.
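
Since the results above are reported as Unweighted Average Recall, a small illustration of the metric may help. UAR is the mean of the per-class recalls, so each emotion class counts equally regardless of how many utterances it has. The sketch below is a minimal pure-Python version under that standard definition; the function name and the emotion labels are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def unweighted_average_recall(y_true, y_pred):
    """Mean of per-class recalls, with every class weighted equally."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")

    total = defaultdict(int)    # number of true samples per class
    correct = defaultdict(int)  # number of correctly predicted samples per class
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1

    # Recall for a class = correct predictions / true samples of that class.
    recalls = [correct[label] / total[label] for label in total]
    return sum(recalls) / len(recalls)


# Hypothetical, imbalanced toy labels (not data from the paper).
y_true = ["angry", "angry", "angry", "angry", "happy", "sad"]
y_pred = ["angry", "angry", "angry", "angry", "sad", "sad"]

uar = unweighted_average_recall(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"UAR = {uar:.3f}, accuracy = {accuracy:.3f}")
```

In this toy case the single "happy" utterance is missed, which pulls UAR well below plain accuracy. That sensitivity to minority classes is why UAR is a common choice for small, imbalanced emotion corpora.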
