CL · LG · Nov 1, 2020

ASAD: A Twitter-based Benchmark Arabic Sentiment Analysis Dataset

arXiv:2011.00578v3 · 37 citations
Originality: Synthesis-oriented
AI Analysis

ASAD is a valuable resource for researchers and practitioners in Arabic NLP, though the contribution is incremental because it builds on existing dataset efforts.

The authors tackled the lack of large, high-quality datasets for Arabic sentiment analysis by creating ASAD, a Twitter-based benchmark dataset with 95K tweets annotated into three sentiment classes, and they implemented baseline models to provide reference results for a competition.

This paper provides a detailed description of ASAD, a new Twitter-based benchmark dataset for Arabic sentiment analysis, launched as part of a competition sponsored by KAUST that awards 10,000 USD, 5,000 USD, and 2,000 USD to the first-, second-, and third-place winners, respectively. Compared to other publicly released Arabic datasets, ASAD is a large, high-quality annotated dataset (95K tweets) with three-class sentiment labels (positive, negative, and neutral). We present the details of the data collection and annotation processes. In addition, we implement several baseline models for the competition task and report their results as a reference for competition participants.
