Identifying Speakers Using Their Emotion Cues
This work addresses speaker identification for audio processing applications. Its contribution is incremental, since it builds on existing methods such as HMMs and SPHMMs.
The paper tackles speaker identification by incorporating emotional cues. A two-stage recognizer achieves an average identification performance of 79.92%, compared with 71.58% for a one-stage recognizer, an absolute gain of 8.34 percentage points.
This paper formulates a new speaker identification approach that employs knowledge of the emotional content of the speaker's speech. The proposed approach is based on a two-stage recognizer that integrates an emotion recognizer and a speaker recognizer into a single recognizer. It employs both Hidden Markov Models (HMMs) and Suprasegmental Hidden Markov Models (SPHMMs) as classifiers. Six emotions are considered in the experiments: neutral, angry, sad, happy, disgust, and fear. Our results show that the average speaker identification performance of the proposed two-stage recognizer is 79.92%, a significant improvement over the 71.58% obtained with a one-stage recognizer. The results obtained with the proposed approach are close to those achieved in subjective evaluation by human listeners.
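The two-stage idea can be illustrated with a minimal sketch. It assumes that the emotion is identified first and that the speaker is then chosen among models specific to that emotion; the abstract does not spell out this ordering. Single Gaussians over a one-dimensional feature stand in for the HMM and SPHMM classifiers, and every function name, model parameter, and the toy utterance below is hypothetical.

```python
import math


def gaussian_loglik(frames, mean, var):
    """Log-likelihood of 1-D feature frames under one Gaussian.

    This is only a stand-in for the score an HMM or SPHMM would produce.
    """
    return sum(
        -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)
        for x in frames
    )


def identify_emotion(frames, emotion_models):
    # Stage 1 (assumed order): pick the emotion whose model scores highest.
    return max(
        emotion_models,
        key=lambda e: gaussian_loglik(frames, *emotion_models[e]),
    )


def identify_speaker(frames, speaker_models, emotion):
    # Stage 2: score only the speaker models tied to the identified emotion.
    candidates = speaker_models[emotion]
    return max(
        candidates,
        key=lambda s: gaussian_loglik(frames, *candidates[s]),
    )


def two_stage_identify(frames, emotion_models, speaker_models):
    emotion = identify_emotion(frames, emotion_models)
    speaker = identify_speaker(frames, speaker_models, emotion)
    return emotion, speaker


# Hypothetical toy models: (mean, variance) per emotion and per speaker.
emotion_models = {"neutral": (0.0, 1.0), "angry": (5.0, 1.0)}
speaker_models = {
    "neutral": {"spk_a": (-0.5, 0.5), "spk_b": (0.5, 0.5)},
    "angry": {"spk_a": (4.5, 0.5), "spk_b": (5.5, 0.5)},
}

utterance = [5.4, 5.7, 5.2, 5.6]  # made-up feature frames
result = two_stage_identify(utterance, emotion_models, speaker_models)
print(result)  # ('angry', 'spk_b')
```

A one-stage recognizer, by contrast, would score the utterance against every speaker model directly, without first narrowing the candidates by emotion.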