CL | Aug 8, 2024

MemeMind at ArAIEval Shared Task: Spotting Persuasive Spans in Arabic Text with Persuasion Techniques Identification

Md Rafiul Biswas, Zubair Shah, Wajdi Zaghouani

arXiv:2408.04540v1 | 14.428 citations | h-index: 30 | Has Code

Originality: Synthesis-oriented

AI Analysis

This work addresses the problem of identifying propaganda in Arabic text for social media and news analysis, but it is incremental as it applies a standard fine-tuning approach to a pre-trained model.

The paper tackled detecting propagandistic spans and persuasion techniques in Arabic text from tweets and news paragraphs, achieving an F1 score of 0.2774 and securing 3rd place in a shared task leaderboard.

This paper focuses on detecting propagandistic spans and persuasion techniques in Arabic text from tweets and news paragraphs. Each entry in the dataset contains a text sample and corresponding labels that indicate the start and end positions of propaganda techniques within the text. Tokens falling within a labeled span were tagged "B" (Begin) or "I" (Inside) for the corresponding propaganda technique, and all remaining tokens were tagged "O" (Outside). Using attention masks, we brought the inputs to a uniform length and assigned BIO tags to each token based on the provided labels. We then used the pre-trained AraBERT-base model for Arabic text tokenization and embeddings, with a token classification layer on top to identify propaganda techniques. Our training process involves a two-phase fine-tuning approach: first, we train only the classification layer for a few epochs; then we fine-tune the full model, updating all parameters. This methodology allows the model to adapt to the specific characteristics of the propaganda detection task while leveraging the knowledge captured by the pre-trained AraBERT model. Our approach achieved an F1 score of 0.2774, securing the 3rd position on the Task 1 leaderboard.
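The BIO tagging and padding steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: it assumes character-offset labels of the form (start, end, technique) with an exclusive end, substitutes simple whitespace tokenization for the AraBERT tokenizer, and uses an English sentence and the label name "Loaded_Language" purely for readability. The helper names `whitespace_tokens`, `bio_tags`, and `pad_to_length` are hypothetical.

```python
from typing import List, Tuple

# Assumed label format (hypothetical): (start_char, end_char, technique), end exclusive.
Span = Tuple[int, int, str]


def whitespace_tokens(text: str) -> List[Tuple[str, int, int]]:
    """Split on whitespace and keep each token's character offsets."""
    tokens = []
    pos = 0
    for piece in text.split():
        start = text.index(piece, pos)
        end = start + len(piece)
        tokens.append((piece, start, end))
        pos = end
    return tokens


def bio_tags(text: str, spans: List[Span]) -> List[str]:
    """Assign one BIO tag per token from character-level span labels."""
    tags = []
    prev_span = None  # index of the span that covered the previous token
    for _, start, end in whitespace_tokens(text):
        tag, current = "O", None
        for i, (s, e, technique) in enumerate(spans):
            if start < e and end > s:  # token overlaps this labeled span
                current = i
                # "B-" opens a span, "I-" continues the same span
                tag = ("I-" if prev_span == i else "B-") + technique
                break
        tags.append(tag)
        prev_span = current
    return tags


def pad_to_length(tags: List[str], max_len: int, pad_tag: str = "O"):
    """Truncate or pad to max_len; the mask marks real tokens with 1."""
    tags = tags[:max_len]
    n_pad = max_len - len(tags)
    attention_mask = [1] * len(tags) + [0] * n_pad
    return tags + [pad_tag] * n_pad, attention_mask


text = "They always lie to the people"
tags = bio_tags(text, [(5, 15, "Loaded_Language")])
padded, mask = pad_to_length(tags, 8)
print(tags)  # ['O', 'B-Loaded_Language', 'I-Loaded_Language', 'O', 'O', 'O']
print(mask)  # [1, 1, 1, 1, 1, 1, 0, 0]
```

With a subword tokenizer, the same overlap test would be applied to the offsets the tokenizer reports for each subword, and padded positions would normally be excluded from the loss rather than tagged "O".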
