Large Language Models for Propaganda Detection
This addresses the problem of detecting subtle propaganda in news articles for NLP applications, but it is incremental as it applies existing LLMs to a known dataset.
The paper tackles propaganda detection in text by evaluating Large Language Models (GPT-3 and GPT-4) on the SemEval-2020 dataset, finding that GPT-4 achieves comparable performance to the state-of-the-art RoBERTa model.
The prevalence of propaganda in our digital society poses a challenge to societal harmony and the dissemination of truth. Detecting propaganda through NLP in text is challenging due to subtle manipulation techniques and contextual dependencies. To address this issue, we investigate the effectiveness of modern Large Language Models (LLMs) such as GPT-3 and GPT-4 for propaganda detection. We conduct experiments using the SemEval-2020 task 11 dataset, which features news articles labeled with 14 propaganda techniques as a multi-label classification problem. Five variations of GPT-3 and GPT-4 are employed, incorporating various prompt engineering and fine-tuning strategies across the different models. We evaluate the models' performance by assessing metrics such as $F1$ score, $Precision$, and $Recall$, comparing the results with the current state-of-the-art approach using RoBERTa. Our findings demonstrate that GPT-4 achieves comparable results to the current state-of-the-art. Further, this study analyzes the potential and challenges of LLMs in complex tasks like propaganda detection.