DeepEMO: Deep Learning for Speech Emotion Recognition
This work addresses speech emotion recognition for industry applications, but it appears incremental: it builds on existing deep transfer learning methods and does not identify a major breakthrough.
The authors tackle speech emotion recognition in industry settings by proposing DeepEMO, a deep learning framework with two pipelines: preprocessing for feature extraction, and a deep transfer learning model for training and recognition. They report practical results despite limited training data and high training costs.
We propose an industry-level deep learning approach to the speech emotion recognition task. In industry, carefully designed deep transfer learning delivers practical results because training data is usually scarce, model training is costly, and models must specialize on dedicated AI tasks. The proposed speech emotion recognition framework, called DeepEMO, consists of two main pipelines: preprocessing, which extracts efficient main features, and a deep transfer learning model, which is trained to recognize emotions. The main source code is available in the https://github.com/enkhtogtokh/deepemo repository.
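
The two-pipeline structure described above can be illustrated with a minimal, dependency-free sketch. The abstract does not specify the features, the pretrained model, or the classifier, so everything below is an assumption made for illustration and is not taken from the DeepEMO repository: log-magnitude spectral features stand in for the preprocessing pipeline, a fixed random projection (`FrozenBackbone`) stands in for a pretrained network whose weights stay frozen, a small logistic-regression head is the only trained part, and synthetic low and high tones stand in for two emotion classes.

```python
import cmath
import math
import random


def frame_signal(signal, frame_len=64, hop=32):
    """Split a 1-D signal into overlapping fixed-length frames."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]


def log_spectrum(frame, n_bins=8):
    """Log-magnitude of DFT bins 1..n_bins of a Hann-windowed frame."""
    n = len(frame)
    windowed = [x * (0.5 - 0.5 * math.cos(2 * math.pi * t / (n - 1)))
                for t, x in enumerate(frame)]
    feats = []
    for k in range(1, n_bins + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(windowed))
        feats.append(math.log(abs(coeff) + 1e-8))
    return feats


def preprocess(signal):
    """Pipeline 1 (assumed): waveform -> one feature vector, averaged over frames."""
    spectra = [log_spectrum(f) for f in frame_signal(signal)]
    return [sum(col) / len(col) for col in zip(*spectra)]


class FrozenBackbone:
    """Hypothetical stand-in for a pretrained network; weights are never updated."""

    def __init__(self, in_dim, out_dim=16, seed=0):
        rng = random.Random(seed)
        scale = 1 / math.sqrt(in_dim)
        self.weights = [[rng.gauss(0, scale) for _ in range(in_dim)]
                        for _ in range(out_dim)]

    def __call__(self, x):
        return [math.tanh(sum(w * v for w, v in zip(row, x)))
                for row in self.weights]


def score(w, b, emb):
    """Linear score of the head; positive means class 1."""
    return sum(wi * ei for wi, ei in zip(w, emb)) + b


def train_head(embeddings, labels, epochs=100, lr=0.1):
    """Pipeline 2 (assumed): train only a logistic-regression head with SGD."""
    w, b = [0.0] * len(embeddings[0]), 0.0
    for _ in range(epochs):
        for emb, y in zip(embeddings, labels):
            grad = 1 / (1 + math.exp(-score(w, b, emb))) - y
            w = [wi - lr * grad * ei for wi, ei in zip(w, emb)]
            b -= lr * grad
    return w, b


def synth_tone(cycles, rng, n=512, frame_len=64):
    """Toy utterance: a noisy sine with `cycles` periods per frame."""
    phase = rng.uniform(0, 2 * math.pi)
    return [math.sin(2 * math.pi * cycles * t / frame_len + phase)
            + rng.gauss(0, 0.1) for t in range(n)]


def demo(seed=1):
    """Train on 30 toy utterances, return accuracy on 10 held-out ones."""
    rng = random.Random(seed)
    # Label 0 = low tone, label 1 = high tone (stand-ins for emotion classes).
    data = [(synth_tone(2 if y == 0 else 6, rng), y) for y in [0, 1] * 20]
    backbone = FrozenBackbone(in_dim=8)
    embs = [backbone(preprocess(sig)) for sig, _ in data]
    labels = [y for _, y in data]
    w, b = train_head(embs[:30], labels[:30])
    hits = sum((score(w, b, e) > 0) == bool(y)
               for e, y in zip(embs[30:], labels[30:]))
    return hits / 10


if __name__ == "__main__":
    print("held-out accuracy:", demo())
```

Training only the head is what makes the approach cheap when labeled data is scarce: the backbone's parameters are reused as they are, and only a handful of weights are fitted to the new task. A practical system would replace the toy features and the random projection with real audio features and a genuinely pretrained model.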