CL, Nov 11, 2019

Understanding BERT performance in propaganda analysis

arXiv:1911.04525v1, 1000 citations

Originality: Synthesis-oriented

AI Analysis

This work addresses propaganda detection for NLP researchers, but it is incremental as it applies an existing method to a new dataset with limited improvements.

The authors tackle fine-grained propaganda analysis at the sentence level with a pretrained BERT model, which achieved an F1 score of 0.62 and ranked third among 25 teams in a shared task. They also identify two weaknesses: the model tends to misclassify opinion pieces as propaganda, and it fails to distinguish quotations of propaganda from actual use of propaganda techniques.

In this paper, we describe our system used in the shared task for fine-grained propaganda analysis at the sentence level. Despite the challenging nature of the task, our pretrained BERT model (team YMJA), fine-tuned on the training dataset provided by the shared task, scored 0.62 F1 on the test set and ranked third among the 25 teams that participated in the contest. We present a set of illustrative experiments to better understand the performance of our BERT model on this shared task. Further, we explore beyond the given dataset for false-positive cases that are likely to be produced by our system. We show that, despite its high performance on the given test set, our system may tend to classify opinion pieces as propaganda and cannot distinguish quotations of propaganda speech from actual usage of propaganda techniques.
