CL AI | Apr 11, 2022

Survey of Aspect-based Sentiment Analysis Datasets

Siva Uday Sampreeth Chebolu, Franck Dernoncourt, Nedim Lipka, Thamar Solorio

arXiv:2204.05232v51.617 citations | h-index: 39 | Has Code

Originality: Synthesis-oriented

AI Analysis

This is an incremental survey that addresses the difficulty researchers face in selecting appropriate datasets for ABSA tasks.

The study tackles the problem of scattered corpora for aspect-based sentiment analysis (ABSA) by creating a database of 65 publicly available datasets covering over 25 domains (45 English datasets and 20 datasets in other languages), so that researchers can quickly identify suitable corpora for specific ABSA subtasks.

Aspect-based sentiment analysis (ABSA) is a natural language processing problem that requires analyzing user-generated reviews to determine: a) the target entity being reviewed, b) the high-level aspect to which it belongs, and c) the sentiment expressed toward the targets and the aspects. Numerous yet scattered corpora for ABSA make it difficult for researchers to quickly identify the corpora best suited for a specific ABSA subtask. This study aims to present a database of corpora that can be used to train and assess autonomous ABSA systems. Additionally, we provide an overview of the major corpora for ABSA and its subtasks and highlight several features that researchers should consider when selecting a corpus. Finally, we discuss the advantages and disadvantages of current collection approaches and make recommendations for future corpora creation. This survey examines 65 publicly available ABSA datasets covering over 25 domains, including 45 English datasets and 20 datasets in other languages.
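To make the three-part definition in the abstract concrete, here is a minimal Python sketch of how a single ABSA annotation (target, aspect, sentiment) might be represented. The class names, the label strings ("FOOD", "SERVICE", "positive", "negative"), and the example sentence are all hypothetical, chosen for illustration only. They are not taken from the survey or from any specific corpus, and real datasets differ in label sets and in whether targets are marked as character offsets.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class AspectAnnotation:
    """One ABSA annotation; field names are illustrative assumptions."""
    target: Optional[str]  # (a) target entity mentioned in the text; None if implicit
    aspect: str            # (b) high-level aspect category the target belongs to
    sentiment: str         # (c) sentiment expressed toward the target/aspect


@dataclass
class ReviewSentence:
    """A review sentence together with its ABSA annotations."""
    text: str
    annotations: List[AspectAnnotation]


def sentiments_by_aspect(sentence: ReviewSentence) -> Dict[str, List[str]]:
    """Group the sentiment labels of a sentence by aspect category."""
    grouped: Dict[str, List[str]] = {}
    for ann in sentence.annotations:
        grouped.setdefault(ann.aspect, []).append(ann.sentiment)
    return grouped


# Hypothetical example sentence with two opposing opinions.
example = ReviewSentence(
    text="The pasta was delicious but the waiter was rude.",
    annotations=[
        AspectAnnotation(target="pasta", aspect="FOOD", sentiment="positive"),
        AspectAnnotation(target="waiter", aspect="SERVICE", sentiment="negative"),
    ],
)

print(sentiments_by_aspect(example))
```

Keeping the target optional reflects the fact that a sentence can express an opinion about an aspect without naming an explicit target term; whether a given corpus annotates such cases is a property of that corpus.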

