Jul 1, 2017

Studying and Enhancing Talking Condition Recognition in Stressful and Emotional Talking Environments Based on HMMs, CHMM2s and SPHMMs

arXiv:1707.00680v1

11.825 citations

Originality: Synthesis-oriented

AI Analysis

This work addresses talking condition recognition in stressful and emotional talking environments, but it is incremental in that it compares existing model variants.

The research tackled the problem of recognizing talking conditions in stressful and emotional environments using HMMs, CHMM2s, and SPHMMs, finding that SPHMMs outperformed the others and stressful environments were recognized better than emotional ones by up to 3.3%.

This research is devoted to studying and enhancing talking condition recognition in stressful and emotional talking environments (two entirely separate environments) based on three separate classifiers: Hidden Markov Models (HMMs), Second-Order Circular Hidden Markov Models (CHMM2s) and Suprasegmental Hidden Markov Models (SPHMMs). The stressful talking environments used in this work are composed of neutral, shouted, slow, loud, soft and fast talking conditions, while the emotional talking environments are made up of neutral, angry, sad, happy, disgust and fear emotions. The results show that SPHMMs outperform both HMMs and CHMM2s in talking condition recognition in stressful and emotional talking environments. The results also demonstrate that talking condition recognition in stressful talking environments outperforms that in emotional talking environments by 2.7%, 1.8% and 3.3% based on HMMs, CHMM2s and SPHMMs, respectively. Based on subjective assessment by human judges, the recognition performance for stressful talking conditions leads that for emotional ones by 5.2%.
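
To make the classification setup concrete, here is a minimal sketch of how a bank of per-condition HMMs can be used for recognition: one model is kept per talking condition, an utterance is scored against each with the forward algorithm, and the highest-scoring label is chosen. This is only an illustration under stated assumptions. It uses plain first-order HMMs with discrete observation symbols, not CHMM2s or SPHMMs, and the function names, the two-condition `MODELS` table and all probabilities are hypothetical toy values, not the paper's features or trained parameters.

```python
import math


def safe_log(p):
    """Natural log that maps zero probability to -inf instead of raising."""
    return math.log(p) if p > 0 else -math.inf


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) over a list of log-values."""
    m = max(values)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(v - m) for v in values))


def log_likelihood(obs, start, trans, emit):
    """Forward algorithm in the log domain for a first-order discrete HMM.

    obs   : list of observation symbol indices
    start : start[i] is the initial probability of state i
    trans : trans[i][j] is the probability of moving from state i to j
    emit  : emit[i][k] is the probability of state i emitting symbol k
    """
    n = len(start)
    alpha = [safe_log(start[i]) + safe_log(emit[i][obs[0]]) for i in range(n)]
    for symbol in obs[1:]:
        alpha = [
            logsumexp([alpha[i] + safe_log(trans[i][j]) for i in range(n)])
            + safe_log(emit[j][symbol])
            for j in range(n)
        ]
    return logsumexp(alpha)


def classify(obs, models):
    """Return the label whose HMM assigns the highest log-likelihood to obs."""
    return max(models, key=lambda label: log_likelihood(obs, *models[label]))


# Hypothetical toy models: (start, trans, emit) per talking condition.
MODELS = {
    "neutral": ([0.5, 0.5], [[0.9, 0.1], [0.1, 0.9]], [[0.8, 0.2], [0.6, 0.4]]),
    "shouted": ([0.5, 0.5], [[0.9, 0.1], [0.1, 0.9]], [[0.2, 0.8], [0.4, 0.6]]),
}

if __name__ == "__main__":
    print(classify([0, 0, 0, 0], MODELS))
    print(classify([1, 1, 1, 1], MODELS))
```

In a real system the observations would be continuous acoustic feature vectors and the models would be trained on labeled speech; the decision rule of picking the best-scoring model is the part this sketch is meant to show.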
