How to Protect Yourself from 5G Radiation? Investigating LLM Responses to Implicit Misinformation
This addresses a critical safety concern for users of LLMs in diverse real-world scenarios by highlighting a persistent challenge in implicit misinformation, though it is incremental as it builds on existing research on misinformation.
The paper tackled the problem of LLMs tacitly spreading misinformation by creating EchoMist, the first benchmark for implicit misinformation, and found that 15 state-of-the-art LLMs performed alarmingly poorly, often failing to detect false premises and generating counterfactual explanations.
As Large Language Models (LLMs) are widely deployed in diverse scenarios, the extent to which they could tacitly spread misinformation emerges as a critical safety concern. Current research primarily evaluates LLMs on explicit false statements, overlooking how misinformation often manifests subtly as unchallenged premises in real-world interactions. We curated EchoMist, the first comprehensive benchmark for implicit misinformation, where false assumptions are embedded in the query to LLMs. EchoMist targets circulated, harmful, and ever-evolving implicit misinformation from diverse sources, including realistic human-AI conversations and social media interactions. Through extensive empirical studies on 15 state-of-the-art LLMs, we find that current models perform alarmingly poorly on this task, often failing to detect false premises and generating counterfactual explanations. We also investigate two mitigation methods, i.e., Self-Alert and RAG, to enhance LLMs' capability to counter implicit misinformation. Our findings indicate that EchoMist remains a persistent challenge and underscore the critical need to safeguard against the risk of implicit misinformation.