Ensemble Language Models for Multilingual Sentiment Analysis
This work addresses sentiment analysis for low-resource languages, but it is incremental as it applies existing ensemble methods to new datasets.
The study tackles sentiment analysis for low-resource languages like Arabic using tweet datasets. It finds that monolingual models perform best and that ensemble models outperform the baselines, with the majority voting ensemble surpassing English-language performance.
The rapid growth of social media enables us to analyze user opinions. Recent work on sentiment analysis has exposed a prominent research gap in understanding human sentiment from content shared on social media. Although sentiment analysis for widely spoken languages has advanced significantly, low-resource languages like Arabic continue to receive little research attention due to resource limitations. In this study, we explore sentiment analysis on tweet texts from SemEval-17 and the Arabic Sentiment Tweet dataset. We investigate four pretrained language models and propose two ensemble language models. Our findings show that monolingual models exhibit superior performance and that ensemble models outperform the baseline, while the majority voting ensemble surpasses English-language performance.
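The abstract names a majority voting ensemble but does not describe how votes are combined. The following is a minimal sketch of hard majority voting over per-model label predictions, not the authors' implementation. The function name `majority_vote`, the label strings, and the tie-breaking rule (prefer the label from the earliest-listed model) are assumptions made for illustration.

```python
from collections import Counter
from typing import List, Sequence


def majority_vote(predictions: Sequence[Sequence[str]]) -> List[str]:
    """Combine label predictions from several models by hard majority vote.

    predictions[m][i] is the label that model m assigns to example i.
    Ties are broken in favour of the earliest-listed model (an assumption;
    the source does not specify a tie-breaking rule).
    """
    if not predictions:
        raise ValueError("at least one model's predictions are required")
    n_examples = len(predictions[0])
    if any(len(p) != n_examples for p in predictions):
        raise ValueError("all models must predict the same number of examples")

    voted = []
    # zip(*predictions) yields, per example, the tuple of labels from all models.
    for labels in zip(*predictions):
        counts = Counter(labels)
        top = max(counts.values())
        # First label (in model order) that reaches the highest vote count.
        winner = next(label for label in labels if counts[label] == top)
        voted.append(winner)
    return voted


if __name__ == "__main__":
    # Hypothetical outputs of four pretrained models on three tweets.
    model_a = ["positive", "negative", "neutral"]
    model_b = ["positive", "neutral", "neutral"]
    model_c = ["negative", "negative", "positive"]
    model_d = ["positive", "negative", "negative"]
    print(majority_vote([model_a, model_b, model_c, model_d]))
```

With four voters, ties are possible, so the tie-breaking rule matters in practice; ordering the models by validation performance is one plausible choice under this sketch.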