CL AI Feb 19, 2021

Sentiment Analysis for YouTube Comments in Roman Urdu

arXiv:2102.10075v1

AI Analysis

This work addresses sentiment analysis for Roman Urdu speakers in Pakistan, but it is incremental as it applies existing methods to a new dataset without novel methodological contributions.

The paper tackles sentiment analysis for YouTube comments in Roman Urdu, a romanized form of Urdu with limited prior work. It compares five supervised learning algorithms and finds that SVM achieves the best accuracy, 64%.

Sentiment analysis is a broad area of machine learning. Much of the existing work targets English-language datasets. In Pakistan, a large amount of data is written in Roman Urdu and is scattered across social sites including Twitter, YouTube, Facebook and similar applications. This study gathers its dataset from YouTube comments on various Pakistani dramas and TV shows. The dataset is labeled for multi-class classification, with comments grouped into positive, negative and neutral sentiment. The study presents a comparative analysis of five supervised learning algorithms: linear regression, SVM, KNN, multilayer perceptron and Naïve Bayes classifier. Accuracy, recall, precision and F-measure are used to measure performance. Results show that SVM reaches an accuracy of 64 percent, the best of the five.
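The four reported measures can be illustrated for a three-class problem with a short sketch. The abstract does not say how per-class scores are combined, so macro-averaging (an unweighted mean over the three classes) is an assumption here. The function name `evaluate` and the toy labels are hypothetical and are not taken from the paper's dataset or results.

```python
from collections import Counter

# Class names follow the abstract; everything else is illustrative.
LABELS = ("positive", "negative", "neutral")


def evaluate(y_true, y_pred, labels=LABELS):
    """Return accuracy and macro-averaged precision, recall and F-measure."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    # Per-class true positives, false positives and false negatives.
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, guess in zip(y_true, y_pred):
        if truth == guess:
            tp[truth] += 1
        else:
            fp[guess] += 1
            fn[truth] += 1

    def ratio(num, den):
        # Score a class as 0.0 when the denominator is empty.
        return num / den if den else 0.0

    precisions, recalls, f_scores = [], [], []
    for label in labels:
        p = ratio(tp[label], tp[label] + fp[label])
        r = ratio(tp[label], tp[label] + fn[label])
        precisions.append(p)
        recalls.append(r)
        f_scores.append(ratio(2 * p * r, p + r))

    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": sum(precisions) / n,  # macro average (assumed)
        "recall": sum(recalls) / n,        # macro average (assumed)
        "f_measure": sum(f_scores) / n,    # macro average (assumed)
    }


# Toy labels for six made-up comments, not data from the study.
y_true = ["positive", "positive", "negative", "negative", "neutral", "neutral"]
y_pred = ["positive", "negative", "negative", "negative", "neutral", "positive"]
scores = evaluate(y_true, y_pred)
print(scores)
```

With macro-averaging, each sentiment class counts equally regardless of how many comments it contains. A weighted average would be a reasonable alternative if the classes are heavily imbalanced.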
