YASO: A Targeted Sentiment Analysis Evaluation Dataset for Open-Domain Reviews
Existing cross-domain evaluation of targeted sentiment analysis (TSA) is restricted to a small number of review domains. YASO addresses this limitation by providing a more diverse and realistic benchmark for researchers and developers working on sentiment analysis for platforms such as Amazon or Yelp.
The authors created YASO, a new dataset for targeted sentiment analysis (TSA) evaluation, comprising 2,215 English sentences from various review domains. Benchmarking five contemporary TSA systems on YASO revealed significant room for improvement, indicating the challenging nature of this open-domain dataset.
Current TSA evaluation in a cross-domain setup is restricted to the small set of review domains available in existing datasets. Such an evaluation is limited and may not reflect true performance on sites like Amazon or Yelp that host diverse reviews from many domains. To address this gap, we present YASO, a new TSA evaluation dataset of open-domain user reviews. YASO contains 2,215 English sentences from dozens of review domains, annotated with target terms and their sentiment. Our analysis verifies the reliability of these annotations and explores the characteristics of the collected data. Benchmark results using five contemporary TSA systems show that there is ample room for improvement on this challenging new dataset. YASO is available at https://github.com/IBM/yaso-tsa.
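
To make the task concrete, the following is a minimal sketch of what a sentence annotated with target terms and their sentiment might look like, and of how a system's predictions could be scored against it. Everything here is an assumption made for illustration: the `Target` record, the character-offset representation, the two-label sentiment set, the example sentence, and the exact-match scoring are hypothetical and are not taken from the YASO release or from the benchmark protocol used in the paper.

```python
from typing import Iterable, NamedTuple, Tuple


class Target(NamedTuple):
    """Hypothetical record for one annotated target (not the YASO file format)."""
    sentence_id: int
    start: int       # character offset where the target term begins
    end: int         # character offset just past the target term
    sentiment: str   # illustrative label set: "positive" or "negative"


def exact_match_scores(
    gold: Iterable[Target], predicted: Iterable[Target]
) -> Tuple[float, float, float]:
    """Return (precision, recall, F1), counting a prediction as correct only
    if its sentence, span, and sentiment all match a gold annotation."""
    gold_set, pred_set = set(gold), set(predicted)
    true_positives = len(gold_set & pred_set)
    precision = true_positives / len(pred_set) if pred_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return precision, recall, f1


# Made-up review sentence with two targets of opposite sentiment.
sentence = "The camera is great but the battery dies fast."
gold = [
    Target(0, 4, 10, "positive"),   # "camera"
    Target(0, 28, 35, "negative"),  # "battery"
]
# A hypothetical system finds both spans but mislabels the second one.
predicted = [
    Target(0, 4, 10, "positive"),
    Target(0, 28, 35, "positive"),
]

precision, recall, f1 = exact_match_scores(gold, predicted)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Exact matching is the strictest choice: a prediction with the right span but the wrong sentiment earns no credit. A more lenient variant could score span detection and sentiment classification separately.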