Human Vocal Sentiment Analysis
This work addresses emotion recognition from speech for potential integration into existing systems, but it appears incremental as it combines established techniques without introducing a fundamentally new method.
The paper tackles human vocal sentiment analysis by combining conventional vocal feature extraction with deep learning approaches and context-level analysis, achieving improved emotion-level classification through hyperparameter sweeps and data augmentation, and explores the feasibility of a real-time system.
In this paper, we combine conventional vocal feature extraction (MFCC, STFT) with deep-learning approaches such as CNNs and with context-level analysis of the accompanying textual data, merging these approaches to improve emotion-level classification. We also evaluate models that have not previously been tested, to gauge differences in performance and accuracy. We apply hyperparameter sweeps and data augmentation to improve performance. Finally, we examine whether a real-time approach is feasible and can be readily integrated into existing systems.
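To make the feature-extraction and augmentation steps concrete, here is a minimal sketch of an STFT magnitude spectrogram and an additive-noise augmentation, using only the Python standard library. It is not the paper's implementation. The function names (`stft_magnitude`, `add_noise`), the frame length, the hop size, the noise level, and the synthetic test tone are all illustrative assumptions. A real pipeline would most likely use an FFT library and recorded speech instead.

```python
import cmath
import math
import random


def hann(n):
    """Periodic Hann window of length n."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]


def stft_magnitude(signal, frame_len=64, hop=32):
    """Magnitude spectrogram: one list of frame_len // 2 + 1 bins per frame."""
    window = hann(frame_len)
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        chunk = [signal[start + i] * window[i] for i in range(frame_len)]
        bins = []
        for k in range(frame_len // 2 + 1):
            # Naive DFT for clarity; an FFT would be used in practice.
            acc = sum(
                chunk[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                for n in range(frame_len)
            )
            bins.append(abs(acc))
        frames.append(bins)
    return frames


def add_noise(signal, noise_std=0.005, seed=0):
    """Augmentation (assumed variant): add zero-mean Gaussian noise per sample."""
    rng = random.Random(seed)
    return [s + rng.gauss(0.0, noise_std) for s in signal]


# Toy input standing in for speech: a 1 kHz tone sampled at 8 kHz.
SAMPLE_RATE = 8000
FREQ = 1000.0
tone = [math.sin(2 * math.pi * FREQ * t / SAMPLE_RATE) for t in range(512)]

spec = stft_magnitude(tone)
noisy_spec = stft_magnitude(add_noise(tone))

# Strongest frequency bin in the first frame of each spectrogram.
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
noisy_peak_bin = max(range(len(noisy_spec[0])), key=lambda k: noisy_spec[0][k])
```

With a 64-sample frame at 8 kHz, bins are spaced 125 Hz apart, so the 1 kHz tone falls in bin 8 for both the clean and the augmented signal. Spectrograms like `spec` are the kind of two-dimensional input a CNN classifier would consume. MFCCs would add a mel filterbank, a logarithm, and a discrete cosine transform on top of this step.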