RADAR: Retrieval-Augmented Detector with Adversarial Refinement for Robust Fake News Detection
This work addresses the spread of misinformation on online platforms and in the media, though its contribution is incremental, as it builds on existing retrieval and adversarial methods.
The paper tackles the detection of LLM-generated fake news by proposing RADAR, a retrieval-augmented detector with adversarial refinement, which achieves 86.98% ROC-AUC on a fake news detection benchmark, significantly outperforming general-purpose LLMs with retrieval.
To efficiently combat the spread of LLM-generated misinformation, we present RADAR, a retrieval-augmented detector with adversarial refinement for robust fake news detection. Our approach employs a generator that rewrites real articles with factual perturbations, paired with a lightweight detector that verifies claims using dense passage retrieval. To enable effective co-evolution, we introduce verbal adversarial feedback (VAF). Rather than relying on scalar rewards, VAF issues structured natural-language critiques; these guide the generator toward more sophisticated evasion attempts, compelling the detector to adapt and improve. On a fake news detection benchmark, RADAR achieves 86.98% ROC-AUC, significantly outperforming general-purpose LLMs with retrieval. Ablation studies confirm that detector-side retrieval yields the largest gains, while VAF and few-shot demonstrations provide critical signals for robust training.
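The generator, detector, and verbal adversarial feedback loop described above can be illustrated with a small toy sketch. This is not the authors' implementation: the `Critique` schema, the word-overlap `retrieve` function (a stand-in for dense passage retrieval), the rule-based `detect` and `generate` functions, and the example texts are all hypothetical. In the described system both components would be learned models, and the detector would be trained on the collected rewrites. The sketch only shows the control flow: the detector returns a structured textual critique rather than a scalar reward, and the generator uses the critique history to choose its next perturbation.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Critique:
    """Structured natural-language feedback (hypothetical schema)."""
    detected: bool   # whether the detector flagged the rewrite as fake
    evidence: str    # passage the detector retrieved
    message: str     # free-text advice passed back to the generator


def retrieve(claim: str, corpus: List[str]) -> str:
    """Toy stand-in for dense passage retrieval: return the passage
    sharing the most words with the claim."""
    words = set(claim.lower().split())
    return max(corpus, key=lambda p: len(words & set(p.lower().split())))


def detect(article: str, corpus: List[str]) -> Critique:
    """Toy detector: flag the article if any of its tokens is missing
    from the retrieved evidence passage."""
    evidence = retrieve(article, corpus)
    evidence_tokens = evidence.lower().split()
    unsupported = [w for w in article.lower().split() if w not in evidence_tokens]
    if unsupported:
        return Critique(True, evidence,
                        f"Tokens {unsupported} are not supported by the evidence.")
    return Critique(False, evidence, "No contradiction found in the evidence.")


# Hypothetical factual perturbations, tried in order.
STRATEGIES = [
    lambda text: text.replace("5", "9"),                # alter a figure
    lambda text: text.replace("approved", "rejected"),  # flip the outcome
]


def generate(real: str, history: List[Critique]) -> str:
    """Toy generator: every critique that caught a rewrite moves it on
    to the next perturbation strategy."""
    caught = sum(c.detected for c in history)
    return STRATEGIES[min(caught, len(STRATEGIES) - 1)](real)


def co_evolve(real: str, corpus: List[str], rounds: int = 2) -> List[Tuple[str, Critique]]:
    """Alternate generation and detection, feeding critiques back.
    Detector training is omitted; the log only collects the examples
    a real system could train on."""
    history: List[Critique] = []
    log: List[Tuple[str, Critique]] = []
    for _ in range(rounds):
        fake = generate(real, history)
        critique = detect(fake, corpus)
        history.append(critique)
        log.append((fake, critique))
    return log


REAL = "the council approved a budget of 5 million dollars"
CORPUS = [REAL, "the mayor opened a new bridge on monday"]

if __name__ == "__main__":
    for fake, critique in co_evolve(REAL, CORPUS):
        print(fake, "|", critique.message)
```

Passing the critique as text, rather than a single score, is what lets the generator see which claim was contradicted and by which evidence. In this toy version that information only advances a strategy index, whereas an LLM-based generator could condition on the full message.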