CL, LG | Jan 8, 2021

Effect of Word Embedding Variable Parameters on Arabic Sentiment Analysis Performance

arXiv:2101.02906v1 | 0.28 citations

Originality: Synthesis-oriented

AI Analysis

This research addresses the challenge of optimizing word embedding parameters for Arabic sentiment analysis, which is crucial for researchers and practitioners working with morphologically rich languages.

This study investigates the impact of word embedding parameters (window size, vector dimension, negative samples) on Arabic sentiment analysis performance using DBOW and DMPV architectures. It evaluates four binary classifiers (Logistic Regression, Decision Tree, Support Vector Machine, Naive Bayes) based on Precision, Recall, and F1-score.
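The parameter study described above can be pictured as a grid over the two architectures and the three embedding parameters. The following is a minimal sketch only: the candidate values are hypothetical (the summary does not list the values tested), and the key names `window`, `vector_size`, and `negative` are illustrative labels chosen here, not names taken from the paper.

```python
from itertools import product

# Hypothetical candidate values; the source does not state which values were tested.
ARCHITECTURES = ["DBOW", "DMPV"]
WINDOW_SIZES = [2, 5, 10]
VECTOR_DIMS = [100, 300]
NEGATIVE_SAMPLES = [5, 10]


def build_grid():
    """Return one configuration dict per combination of architecture and parameters."""
    return [
        {"architecture": arch, "window": w, "vector_size": d, "negative": n}
        for arch, w, d, n in product(
            ARCHITECTURES, WINDOW_SIZES, VECTOR_DIMS, NEGATIVE_SAMPLES
        )
    ]


grid = build_grid()
print(len(grid))  # 2 architectures * 3 windows * 2 dimensions * 2 negative-sample counts = 24
```

Each configuration would then be used to train one embedding model whose document vectors feed the four classifiers, so that the effect of a single parameter can be read off while the others are held fixed.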

Social media platforms such as Twitter and Facebook have generated a growing number of comments that contain users' opinions. Sentiment analysis research processes these comments to extract opinions and classify them as positive or negative. Arabic is a morphologically rich language; thus, classical techniques for English sentiment analysis cannot be applied directly to Arabic. Word embedding can be considered one of the successful methods for addressing the morphological problem of Arabic. Much work has been done on Arabic sentiment analysis based on word embedding, but no study has focused on the variable parameters. This study discusses three parameters (window size, vector dimension, and number of negative samples) for Arabic sentiment analysis using the DBOW and DMPV architectures. A large corpus from previous works was generated to learn word representations and extract features. Four binary classifiers (Logistic Regression, Decision Tree, Support Vector Machine, and Naive Bayes) are used to detect sentiment. Classifier performance is evaluated based on Precision, Recall, and F1-score.
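For reference, the three evaluation metrics named in the abstract can be computed from the counts of true positives, false positives, and false negatives. The sketch below uses the standard definitions; the function name `binary_metrics` and the toy labels are illustrative assumptions and do not come from the paper.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Return (precision, recall, f1) for a binary task, treating `positive` as the target class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return precision, recall, f1


# Toy labels (1 = positive sentiment, 0 = negative sentiment), invented for illustration.
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0]
precision, recall, f1 = binary_metrics(y_true, y_pred)
print(precision, recall, f1)
```

With these toy labels there are 2 true positives, 1 false positive, and 1 false negative, so precision, recall, and F1 all equal 2/3.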
