HC · Dec 13, 2016

Gated Recurrent Unit (GRU) for Emotion Classification from Noisy Speech

arXiv:1612.07778v114.1122 citations

Originality: Synthesis-oriented

AI Analysis

This incremental work addresses emotion recognition for smartphone applications in noisy environments.

The authors tackle emotion classification from noisy speech by applying the Gated Recurrent Unit (GRU), a gated recurrent neural network architecture, and find that it reduces run-time by 18.16% compared to LSTM while maintaining comparable performance.

Despite the enormous interest in emotion classification from speech, the impact of noise on emotion classification is not well understood. This matters because advances in smartphone technology make the smartphone a powerful medium for speech emotion recognition in natural environments outside the laboratory, where speech is likely to contain background noise. We capitalize on recent breakthroughs in Recurrent Neural Networks (RNNs) and investigate their performance for emotion classification from noisy speech. We focus in particular on the recently proposed Gated Recurrent Unit (GRU), which has yet to be explored for emotion recognition from speech. Experiments with speech mixed with eight different types of noise reveal that the GRU incurs an 18.16% smaller run-time while performing comparably to the Long Short-Term Memory (LSTM), the most popular RNN proposed to date. This result is promising for embedded platforms in general and motivates further studies on using the GRU to its full potential for emotion recognition on smartphones.
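One plausible contributor to the run-time gap is that a GRU layer carries three weight blocks (update gate, reset gate, candidate state) where an LSTM layer carries four (input, forget, and output gates plus the cell candidate). The sketch below counts parameters under the textbook formulation with one bias vector per block; the input and hidden sizes are hypothetical, since the abstract does not state the network dimensions, and the helper names are illustrative only. It is not the authors' code, and a parameter ratio does not by itself predict a measured run-time.

```python
def gated_rnn_params(input_dim, hidden_dim, n_blocks):
    """Parameter count for a gated recurrent layer.

    Each block has an input weight matrix (hidden_dim x input_dim),
    a recurrent weight matrix (hidden_dim x hidden_dim), and one bias
    vector (hidden_dim). This is the textbook formulation; some
    libraries add a second bias per block.
    """
    per_block = hidden_dim * (input_dim + hidden_dim) + hidden_dim
    return n_blocks * per_block


def gru_params(input_dim, hidden_dim):
    # GRU: update gate, reset gate, candidate state -> 3 blocks
    return gated_rnn_params(input_dim, hidden_dim, 3)


def lstm_params(input_dim, hidden_dim):
    # LSTM: input, forget, output gates + cell candidate -> 4 blocks
    return gated_rnn_params(input_dim, hidden_dim, 4)


def relative_reduction(baseline, value):
    """Percentage by which `value` is smaller than `baseline`."""
    return 100.0 * (baseline - value) / baseline


if __name__ == "__main__":
    # Hypothetical sizes chosen for illustration only.
    input_dim, hidden_dim = 40, 128
    n_gru = gru_params(input_dim, hidden_dim)
    n_lstm = lstm_params(input_dim, hidden_dim)
    print("GRU parameters: ", n_gru)
    print("LSTM parameters:", n_lstm)
    print("Reduction (%):  ", relative_reduction(n_lstm, n_gru))
```

Under this formulation the GRU always has three quarters of the LSTM's parameters, whatever the layer sizes. The same `relative_reduction` helper applies to measured timings: passing the LSTM run-time as `baseline` and the GRU run-time as `value` gives a percentage of the kind reported in the abstract.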
