AlbMoRe: A Corpus of Movie Reviews for Sentiment Analysis in Albanian
This paper provides a dataset for sentiment analysis research in Albanian, addressing a resource gap for low-resource languages, but the contribution is incremental because it applies existing methods to new data.
The authors tackled the lack of resources for low-resource languages by creating AlbMoRe, a corpus of 800 sentiment-annotated movie reviews in Albanian, and reported preliminary results using traditional machine learning classifiers as baselines.
The lack of available resources such as text corpora for low-resource languages seriously hinders research on natural language processing and computational linguistics. This paper presents AlbMoRe, a corpus of 800 sentiment-annotated movie reviews in Albanian. Each text is labeled as positive or negative and can be used for sentiment analysis research. Preliminary results based on traditional machine learning classifiers trained on the AlbMoRe samples are also reported. These results can serve as comparison baselines for future research experiments.
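To make the idea of a traditional baseline on binary-labeled reviews concrete, here is a minimal sketch of a multinomial Naive Bayes classifier with add-one smoothing, written with the Python standard library only. The text above does not name the classifiers used, so Naive Bayes is an assumed stand-in; the function names (`tokenize`, `train_nb`, `predict_nb`) and the four toy reviews are invented for illustration and are not taken from AlbMoRe.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Assumption: lowercase whitespace tokenization; a real pipeline may differ.
    return text.lower().split()


def train_nb(samples):
    """Collect per-class document and word counts from (text, label) pairs."""
    doc_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in samples:
        doc_counts[label] += 1
        for tok in tokenize(text):
            word_counts[label][tok] += 1
            vocab.add(tok)
    return doc_counts, word_counts, vocab


def predict_nb(model, text):
    """Return the label with the highest smoothed log-probability."""
    doc_counts, word_counts, vocab = model
    total_docs = sum(doc_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(doc_counts):
        # Log prior for the class.
        score = math.log(doc_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for tok in tokenize(text):
            if tok not in vocab:
                continue  # skip words never seen in training
            # Add-one (Laplace) smoothed log likelihood.
            count = word_counts[label][tok]
            score += math.log((count + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Invented toy reviews (hypothetical, NOT samples from the corpus).
toy_train = [
    ("film shumë i mirë", "positive"),
    ("aktrim i bukur", "positive"),
    ("film shumë i keq", "negative"),
    ("aktrim i mërzitshëm", "negative"),
]
model = train_nb(toy_train)
print(predict_nb(model, "film i bukur"))
```

With the real corpus, the toy list would be replaced by the 800 labeled reviews, and accuracy would be measured on a held-out split so the numbers are comparable with other baselines.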