CyberWallE at SemEval-2020 Task 11: An Analysis of Feature Engineering for Ensemble Models for Propaganda Detection
This work addresses propaganda detection and is aimed at NLP researchers. Its contribution is incremental: it applies existing methods (a bi-LSTM and ensemble models with feature engineering) to a specific shared task.
The paper addresses the detection of propaganda techniques in news articles through participation in SemEval-2020 Task 11. It achieves an F1-score of 43.86% for span identification and 57.37% for technique classification, ranking 8th in both subtasks.
This paper describes our participation in the SemEval-2020 task Detection of Propaganda Techniques in News Articles. We participate in both subtasks: Span Identification (SI) and Technique Classification (TC). We use a bi-LSTM architecture in the SI subtask and train a complex ensemble model for the TC subtask. Our architectures are built using embeddings from BERT in combination with additional lexical features and extensive label post-processing. Our systems achieve a rank of 8 out of 35 teams in the SI subtask (F1-score: 43.86%) and 8 out of 31 teams in the TC subtask (F1-score: 57.37%).
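To make the idea of label post-processing in span identification concrete, the following is a minimal sketch of one generic way to turn per-token predictions into character spans and to merge spans that lie close together. It is not the authors' procedure, which the abstract does not describe. The function names `labels_to_spans` and `merge_close_spans`, the `max_gap` parameter, and the example offsets are all hypothetical and chosen only for illustration.

```python
from typing import List, Optional, Tuple

# (start, end) character offsets, end exclusive. Hypothetical representation.
Span = Tuple[int, int]


def labels_to_spans(token_offsets: List[Span], labels: List[int]) -> List[Span]:
    """Group runs of consecutive positively labelled tokens into character spans."""
    spans: List[Span] = []
    current: Optional[List[int]] = None
    for (start, end), label in zip(token_offsets, labels):
        if label == 1:
            if current is None:
                current = [start, end]   # open a new span
            else:
                current[1] = end         # extend the open span
        elif current is not None:
            spans.append((current[0], current[1]))
            current = None
    if current is not None:              # close a span that runs to the last token
        spans.append((current[0], current[1]))
    return spans


def merge_close_spans(spans: List[Span], max_gap: int = 1) -> List[Span]:
    """Merge spans separated by at most max_gap characters (an assumed heuristic)."""
    merged: List[Span] = []
    for start, end in sorted(spans):
        if merged and start - merged[-1][1] <= max_gap:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


# Illustrative input: five tokens with made-up offsets and predicted labels.
offsets = [(0, 3), (4, 9), (10, 12), (13, 20), (21, 25)]
predicted = [1, 1, 0, 1, 0]
raw_spans = labels_to_spans(offsets, predicted)
print(raw_spans)
print(merge_close_spans(raw_spans, max_gap=4))
```

A larger `max_gap` yields fewer, longer spans. Whether this helps depends on how the span-level F1 metric rewards partial overlap, so any such threshold would need to be tuned on development data.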