CL · Aug 13, 2024

Multilingual Models for Check-Worthy Social Media Posts Detection

arXiv:2408.06737v1 · 2 citations · h-index: 1
Originality: Synthesis-oriented
AI Analysis

This work addresses the problem of identifying check-worthy content for fact-checkers and moderators across multiple languages, including low-resource ones, but its contribution is incremental.

The study tackles the detection of verifiable factual claims and harmful claims in social media posts by developing multilingual transformer-based models. The results were validated against state-of-the-art models, and the comparison indicated that the proposed models are robust.

This work presents an extensive study of transformer-based NLP models for detecting social media posts that contain verifiable factual claims and harmful claims. The study covers dataset collection, dataset pre-processing, architecture selection, configuration of settings, model training (fine-tuning), model testing, and implementation. It includes a comprehensive analysis of different models, with a special focus on multilingual models, where a single model can process social media posts in both English and low-resource languages such as Arabic, Bulgarian, Dutch, Polish, Czech, and Slovak. The results were validated against state-of-the-art models, and the comparison demonstrated the robustness of the proposed models. The novelty of this work lies in the development of multi-label multilingual classification models that can efficiently detect harmful posts and posts that contain verifiable factual claims at the same time.
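
To make the multi-label idea concrete, here is a minimal sketch of how one model output could be decoded into two independent decisions for a single post. It assumes one logit per label, a sigmoid per label, and a 0.5 threshold. The label names, the threshold, and the function `decode_multilabel` are illustrative assumptions, not details taken from the paper.

```python
import math

# Hypothetical label names; the paper's actual label set may differ.
LABELS = ("verifiable_factual_claim", "harmful")


def sigmoid(x: float) -> float:
    """Squash a raw logit into a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def decode_multilabel(logits, threshold=0.5):
    """Turn one logit per label into independent yes/no decisions.

    Unlike softmax-based single-label classification, each label is
    scored on its own, so a post can receive both labels or neither.
    The 0.5 threshold is an assumed default, not a value from the paper.
    """
    if len(logits) != len(LABELS):
        raise ValueError("expected one logit per label")
    return {label: sigmoid(z) >= threshold for label, z in zip(LABELS, logits)}


# Made-up logits for illustration only (not real model output).
print(decode_multilabel([2.0, -1.0]))
```

In a sketch like this, a single forward pass of a shared multilingual encoder would feed both labels, which is one plausible reading of how both kinds of post are detected at the same time.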

