Emotion Detection in Reddit: Comparative Study of Machine Learning and Deep Learning Techniques
The study addresses emotion detection for text analysis applications, but its contribution is incremental: it compares existing methods on a known dataset.
This study compares machine learning and deep learning models for emotion detection in Reddit comments. It finds that a Stacking classifier outperforms the other models, including EmoBERTa, in accuracy and overall performance, and deploys that classifier in a web application for real-world use.
Emotions are pivotal in human communication because they significantly influence behavior, relationships, and decision-making. This study focuses on text-based emotion detection using the GoEmotions dataset, which annotates Reddit comments with 27 distinct emotions. These emotions are mapped to Ekman's six basic categories: joy, anger, fear, sadness, disgust, and surprise. To determine the best model for the task, we evaluate six machine learning models, three ensemble models, and a Long Short-Term Memory (LSTM) model. Results indicate that the Stacking classifier outperforms the other models in accuracy and overall performance. We also benchmark our models against EmoBERTa, a pre-trained emotion detection model, and find the Stacking classifier more effective. Finally, we deploy the Stacking classifier in a Streamlit web application, underscoring its potential for real-world text-based emotion analysis.
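The label-mapping step described above can be illustrated with a minimal sketch. It assumes the Ekman grouping distributed with the GoEmotions dataset; the abstract does not state which grouping the study used, so the table below may differ from the authors' own. The names `EKMAN_GROUPS`, `FINE_TO_EKMAN`, and `to_ekman`, and the example label lists, are hypothetical and chosen for illustration only.

```python
from collections import Counter

# Assumption: the Ekman grouping published alongside GoEmotions.
# The study's exact mapping is not given in the abstract.
EKMAN_GROUPS = {
    "anger": ["anger", "annoyance", "disapproval"],
    "disgust": ["disgust"],
    "fear": ["fear", "nervousness"],
    "joy": [
        "joy", "amusement", "approval", "excitement", "gratitude", "love",
        "optimism", "relief", "pride", "admiration", "desire", "caring",
    ],
    "sadness": ["sadness", "disappointment", "embarrassment", "grief", "remorse"],
    "surprise": ["surprise", "realization", "confusion", "curiosity"],
}

# Invert the grouping into a fine-grained -> Ekman lookup table.
FINE_TO_EKMAN = {
    fine: coarse for coarse, fines in EKMAN_GROUPS.items() for fine in fines
}


def to_ekman(fine_labels):
    """Map one comment's fine-grained labels to sorted, unique Ekman categories.

    Labels outside the 27 emotions (for example "neutral") are dropped.
    """
    return sorted({FINE_TO_EKMAN[l] for l in fine_labels if l in FINE_TO_EKMAN})


# Hypothetical annotations for four comments (not taken from the dataset).
examples = [["gratitude", "joy"], ["annoyance"], ["nervousness", "confusion"], ["neutral"]]
mapped = [to_ekman(labels) for labels in examples]

# Class distribution after mapping, useful for spotting imbalance.
counts = Counter(category for labels in mapped for category in labels)
print(mapped)
print(counts)
```

Under this grouping, twelve of the 27 emotions collapse into joy, so the mapped label distribution is likely to be skewed toward that class. This is worth keeping in mind when comparing classifiers on accuracy alone.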