CL AI SI | Apr 23, 2025

Debunking with Dialogue? Exploring AI-Generated Counterspeech to Challenge Conspiracy Theories

arXiv:2504.16604v2 | 3 citations | h-index: 1
Originality: Synthesis-oriented
AI Analysis

The paper addresses the challenge of automating counterspeech against conspiracy theories, but its contribution is incremental: it documents the limitations of current models rather than proposing a solution.

The study asks whether counterspeech against online conspiracy theories can be scaled with LLMs, and evaluates responses generated by GPT-4o, Llama 3, and Mistral. The outputs were often generic, repetitive, or superficial. The models also over-acknowledged fear and hallucinated facts, sources, or figures.

Counterspeech is a key strategy against harmful online content, but scaling expert-driven efforts is challenging. Large Language Models (LLMs) present a potential solution, though their use in countering conspiracy theories is under-researched. Unlike for hate speech, no datasets exist that pair conspiracy theory comments with expert-crafted counterspeech. We address this gap by evaluating the ability of GPT-4o, Llama 3, and Mistral to effectively apply counterspeech strategies derived from psychological research provided through structured prompts. Our results show that the models often generate generic, repetitive, or superficial results. Additionally, they over-acknowledge fear and frequently hallucinate facts, sources, or figures, making their prompt-based use in practical applications problematic.

