CL · May 14, 2023

CroSentiNews 2.0: A Sentence-Level News Sentiment Corpus

Gaurish Thakkar, Nives Mikelic Preradović, Marko Tadić

arXiv:2305.08187v1 · 0.5

Originality: Synthesis-oriented

AI Analysis

This work addresses a data gap for sentiment analysis in the Croatian news domain, but the contribution is incremental because it extends an existing dataset.

The authors address the lack of a sentence-level sentiment dataset for Croatian news by creating CroSentiNews 2.0, which contains 14.5K annotated sentences tagged with 5 classes. They also report baseline scores and inter-annotator agreement.

This article presents a sentence-level sentiment dataset for the Croatian news domain. In addition to the 3K annotated texts already present, our dataset contains 14.5K annotated sentence occurrences that have been tagged with 5 classes. We describe the annotation process, report inter-annotator agreement, and provide baseline scores.
