CL · IR · LG · Sep 27, 2021

Evaluation of Non-Negative Matrix Factorization and n-stage Latent Dirichlet Allocation for Emotion Analysis in Turkish Tweets

arXiv:2110.00418v1 · 3 citations
Originality: Synthesis-oriented
AI Analysis

This work addresses emotion detection in Turkish social media data. Its contribution is incremental, since it applies existing methods to a specific language and dataset.

The study compared Non-Negative Matrix Factorization (NMF) and n-stage Latent Dirichlet Allocation (LDA) for emotion analysis in Turkish tweets. NMF was the most successful topic modeling method; in the Weka classification experiments, n-stage LDA gave the best results and Random Forest was the top classifier.

Social media use has become widespread as technology has developed, and analyzing social media comments now plays an important role in areas such as media and advertising. For this reason, both new and traditional natural language processing methods are used to detect the emotion expressed in these posts. In this paper, two topic modeling methods, Latent Dirichlet Allocation (LDA) and Non-Negative Matrix Factorization (NMF), were used to determine which emotion Turkish tweets posted on Twitter express. In addition, the accuracy of a proposed n-stage method based on LDA was analyzed. The dataset covers five emotions: angry, fear, happy, sad, and confused. NMF was the most successful of the topic modeling methods in this study. Next, the word weights and class labels of the topics were used to build a Weka-compatible file, and the F1-measure of the Random Forest, Naive Bayes, and Support Vector Machine classifiers was analyzed on it. Among the Weka results, the most successful method was n-stage LDA, and the most successful algorithm was Random Forest.
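The abstract mentions turning topic word weights and class labels into a file suitable for Weka, but does not give the file layout. The sketch below shows one plausible way to do that step in Python by writing Weka's ARFF text format. The function name `to_arff`, the relation name, the attribute names, and the toy weights are all hypothetical and are not taken from the paper; only the five emotion labels come from the abstract.

```python
import io

# The five emotion classes named in the abstract.
EMOTIONS = ["angry", "fear", "happy", "sad", "confused"]


def to_arff(relation, vocabulary, rows, labels=EMOTIONS):
    """Render (word-weight vector, emotion label) rows as ARFF text.

    Hypothetical helper: the paper does not describe its exact export step.
    Attribute names are assumed to be plain ASCII tokens without spaces,
    so no ARFF quoting is applied here.
    """
    out = io.StringIO()
    out.write(f"@relation {relation}\n\n")
    # One numeric attribute per vocabulary word.
    for word in vocabulary:
        out.write(f"@attribute {word} numeric\n")
    # Nominal class attribute listing the allowed emotion labels.
    out.write("@attribute emotion {" + ",".join(labels) + "}\n\n")
    out.write("@data\n")
    for weights, label in rows:
        if len(weights) != len(vocabulary):
            raise ValueError("one weight per vocabulary word is required")
        if label not in labels:
            raise ValueError(f"unknown label: {label}")
        values = ",".join(f"{w:.4f}" for w in weights)
        out.write(f"{values},{label}\n")
    return out.getvalue()


# Toy, made-up data purely for illustration (not results from the paper).
vocabulary = ["w_mutlu", "w_korku", "w_sinir"]
rows = [
    ([0.82, 0.03, 0.01], "happy"),
    ([0.02, 0.74, 0.10], "fear"),
    ([0.01, 0.12, 0.69], "angry"),
]

arff_text = to_arff("turkish_tweet_emotions", vocabulary, rows)
print(arff_text)
```

The resulting text can be saved with an `.arff` extension and opened in Weka, where classifiers such as Random Forest, Naive Bayes, or a Support Vector Machine can be evaluated on it. Real vocabularies with spaces or special characters in attribute names would need ARFF quoting, which this sketch leaves out.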

