Language, Culture, and Ideology: Personalizing Offensiveness Detection in Political Tweets with Reasoning LLMs
This work addresses the challenge of personalizing offensiveness detection in political discourse across languages and ideologies. The contribution is incremental: it builds on existing datasets and models to improve nuance in sociopolitical text classification.
The study assesses offensiveness in political tweets by prompting large language models (LLMs) to adopt specific political and cultural perspectives. It finds that larger models with reasoning abilities, such as DeepSeek-R1 and o4-mini, are more consistent and more sensitive to ideological and cultural variation than smaller models.
We explore how large language models (LLMs) assess offensiveness in political discourse when prompted to adopt specific political and cultural perspectives. Using a multilingual subset of the MD-Agreement dataset centered on tweets from the 2020 US elections, we evaluate several recent LLMs, including DeepSeek-R1, o4-mini, GPT-4.1-mini, Qwen3, Gemma, and Mistral. Each model is tasked with judging tweets as offensive or non-offensive from the viewpoints of varied political personas (far-right, conservative, centrist, progressive) across English, Polish, and Russian contexts. Our results show that larger models with explicit reasoning abilities (e.g., DeepSeek-R1, o4-mini) are more consistent and sensitive to ideological and cultural variation, while smaller models often fail to capture subtle distinctions. We find that reasoning capabilities significantly improve both the personalization and interpretability of offensiveness judgments, suggesting that such mechanisms are key to adapting LLMs for nuanced sociopolitical text classification across languages and ideologies.
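To make the evaluation setup concrete, the persona-by-context grid described above could be sketched as follows. This is only an illustrative sketch: the prompt wording, the function names (`build_prompt`, `parse_label`, `prompt_grid`), and the label-parsing rule are assumptions for illustration and are not taken from the paper. No model is called here; the code only constructs prompts and parses a hypothetical text response.

```python
from itertools import product
from typing import Dict, Optional, Tuple

# Personas and language contexts named in the abstract.
PERSONAS = ["far-right", "conservative", "centrist", "progressive"]
CONTEXTS = ["English", "Polish", "Russian"]


def build_prompt(tweet: str, persona: str, context: str) -> str:
    """Build one persona-conditioned prompt (hypothetical wording)."""
    return (
        f"Adopt the perspective of a {persona} person in a {context}-speaking "
        "cultural context. Decide whether the tweet below is offensive from "
        "that perspective. Answer with exactly one word: "
        "'offensive' or 'non-offensive'.\n\n"
        f"Tweet: {tweet}"
    )


def parse_label(response: str) -> Optional[str]:
    """Map a free-text model response to a binary label, or None if unclear."""
    text = response.strip().lower()
    # Check the negative label first: "non-offensive" contains "offensive".
    if "non-offensive" in text or "not offensive" in text:
        return "non-offensive"
    if "offensive" in text:
        return "offensive"
    return None


def prompt_grid(tweet: str) -> Dict[Tuple[str, str], str]:
    """Return one prompt per (persona, context) pair for a single tweet."""
    return {
        (persona, context): build_prompt(tweet, persona, context)
        for persona, context in product(PERSONAS, CONTEXTS)
    }
```

Under these assumptions, each tweet yields twelve prompts (four personas times three contexts), and comparing the parsed labels across that grid is one simple way to look at how a model's judgment shifts with ideology and culture.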