CL | AI | Sep 15, 2023

Fake News Detectors are Biased against Texts Generated by Large Language Models

Jinyan Su, Terry Yue Zhuo, Jonibek Mansurov, Di Wang, Preslav Nakov

arXiv:2309.08674v1 | 37 citations | h-index: 47

Originality: Highly original

AI Analysis

This paper addresses biased fake news detection in the era of LLMs, a problem critical to maintaining trust and safety in information systems, though the contribution is incremental because it builds on existing detection methods.

The study found that existing fake news detectors are biased, being more likely to flag LLM-generated content as fake while misclassifying human-written fake news as genuine, and introduced a mitigation strategy that improved detection accuracy for both types.

The spread of fake news has emerged as a critical challenge, undermining trust and posing threats to society. In the era of Large Language Models (LLMs), the capability to generate believable fake content has intensified these concerns. In this study, we present a novel paradigm to evaluate fake news detectors in scenarios involving both human-written and LLM-generated misinformation. Intriguingly, our findings reveal a significant bias in many existing detectors: they are more prone to flagging LLM-generated content as fake news while often misclassifying human-written fake news as genuine. This unexpected bias appears to arise from distinct linguistic patterns inherent to LLM outputs. To address this, we introduce a mitigation strategy that leverages adversarial training with LLM-paraphrased genuine news. The resulting model yielded marked improvements in detection accuracy for both human and LLM-generated news. To further catalyze research in this domain, we release two comprehensive datasets, GossipCop++ and PolitiFact++, thus amalgamating human-validated articles with LLM-generated fake and real news.
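The bias described in the abstract can be made concrete by breaking a detector's predictions down by authorship and ground-truth label. The following is a minimal sketch, not the authors' evaluation code: the function name `flag_rates`, the record layout `(source, label, predicted)`, and the toy predictions are all assumptions made for illustration, and the numbers are not results from the paper.

```python
from collections import defaultdict


def flag_rates(records):
    """Return the fraction of items predicted 'fake' per (source, label) group.

    Each record is a hypothetical (source, label, predicted) tuple, where
    source is 'human' or 'llm' and label/predicted are 'real' or 'fake'.
    """
    counts = defaultdict(lambda: [0, 0])  # group -> [flagged_as_fake, total]
    for source, label, predicted in records:
        group = (source, label)
        counts[group][1] += 1
        if predicted == "fake":
            counts[group][0] += 1
    return {group: flagged / total for group, (flagged, total) in counts.items()}


# Toy, made-up predictions purely for illustration (not data from the paper).
records = [
    ("human", "real", "real"),
    ("human", "real", "real"),
    ("human", "fake", "real"),  # human-written fake news missed by the detector
    ("human", "fake", "fake"),
    ("llm", "real", "fake"),    # LLM-generated real news wrongly flagged
    ("llm", "real", "real"),
    ("llm", "fake", "fake"),
    ("llm", "fake", "fake"),
]

rates = flag_rates(records)
for group in sorted(rates):
    print(group, rates[group])
```

In this framing, a detector biased in the way the abstract describes would show a higher flag rate on the `("llm", "real")` group than on `("human", "real")`, and a lower flag rate on `("human", "fake")` than on `("llm", "fake")`.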
