CL AI | May 4

mdok-style at SemEval-2026 Task 9: Finetuning LLMs for Multilingual Polarization Detection

Dominik Macko, Alok Debnath, Jakub Simko

arXiv:2605.02695

AI Analysis

This work addresses the need for automated detection of online polarization to prevent escalation into hate speech, targeting social media platforms and online communities.

The authors finetuned mid-size LLMs with QLoRA for multilingual polarization detection across 22 languages, augmenting the training data with anonymized, lower-cased, upper-cased, and homoglyphied variants to make detection more robust.

SemEval-2026 Task 9 focuses on multilingual polarization detection. Specifically, it covers the identification of multilingual, multicultural, and multievent polarization along three axes (one per subtask): detection, type, and manifestation. Online polarization is a concern because it is often followed by hate speech, offensive discourse, and social fragmentation. Detecting it before it escalates is therefore crucial for a safer and more inclusive online space. We addressed this SemEval task by finetuning mid-size LLMs for sequence classification using the QLoRA parameter-efficient finetuning technique. We augmented the multilingual (22 languages) training sets with anonymized, lower-cased, upper-cased, and homoglyphied counterparts, making the detection more robust.
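The augmentation described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact anonymization rules or the homoglyph mapping, so everything below is an assumption made for illustration: the `[USER]` and `[URL]` placeholders, the two regular expressions, the small Latin-to-Cyrillic `HOMOGLYPHS` table, and the function names `anonymize`, `homoglyph`, and `augment` are all hypothetical and not taken from the authors' code.

```python
import re

# Assumed mapping of Latin letters to visually similar Cyrillic ones.
# The mapping actually used by the authors is not given in the abstract.
HOMOGLYPHS = {
    "a": "\u0430",  # Cyrillic a
    "c": "\u0441",  # Cyrillic es
    "e": "\u0435",  # Cyrillic ie
    "o": "\u043e",  # Cyrillic o
    "p": "\u0440",  # Cyrillic er
    "x": "\u0445",  # Cyrillic ha
    "y": "\u0443",  # Cyrillic u
}

# Assumed anonymization targets: URLs and @-mentions.
URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")


def anonymize(text: str) -> str:
    """Replace URLs and user mentions with placeholder tokens."""
    text = URL_RE.sub("[URL]", text)
    return MENTION_RE.sub("[USER]", text)


def homoglyph(text: str) -> str:
    """Swap mapped Latin letters for look-alike characters."""
    return "".join(HOMOGLYPHS.get(ch, ch) for ch in text)


def augment(text: str) -> list[str]:
    """Return the original text followed by its four augmented variants."""
    return [
        text,
        anonymize(text),
        text.lower(),
        text.upper(),
        homoglyph(text),
    ]


if __name__ == "__main__":
    for variant in augment("@Alice see https://example.com now"):
        print(variant)
```

In a training pipeline each variant would keep the label of the text it was derived from, so a set augmented this way contains up to five copies of every example. Whether the authors deduplicate variants that come out identical to the original (for instance, lower-casing an already lower-cased text) is not stated.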
