Debunk and Infer: Multimodal Fake News Detection via Diffusion-Generated Evidence and LLM Reasoning
This work addresses the spread of fake news and its threat to information credibility, reporting incremental improvements in detection performance and interpretability.
The paper tackles fake news detection on multimedia platforms by proposing a Debunk-and-Infer framework that uses diffusion models to generate evidence and large language models for reasoning; experiments on the FakeSV and FVC datasets show improved detection accuracy.
The rapid spread of fake news across multimedia platforms presents serious challenges to information credibility. In this paper, we propose a Debunk-and-Infer framework for Fake News Detection (DIFND) that leverages debunking knowledge to enhance both the performance and interpretability of fake news detection. DIFND integrates the generative strength of conditional diffusion models with the collaborative reasoning capabilities of multimodal large language models (MLLMs). Specifically, debunk diffusion is employed to generate refuting or authenticating evidence based on the multimodal content of news videos, enriching the evaluation process with diverse yet semantically aligned synthetic samples. To improve inference, we propose a chain-of-debunk strategy in which a multi-agent MLLM system produces logic-grounded, multimodal-aware reasoning content and a final veracity judgment. By jointly modeling multimodal features, generative debunking cues, and reasoning-rich verification within a unified architecture, DIFND achieves notable improvements in detection accuracy. Extensive experiments on the FakeSV and FVC datasets show that DIFND not only outperforms existing approaches but also delivers trustworthy decisions.
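As a rough illustration of what jointly modeling the three signal types could look like, the sketch below concatenates a multimodal feature vector, a debunking-evidence vector, and a reasoning vector, then scores the result with a logistic function. This is a toy late-fusion example under our own assumptions, not the DIFND architecture: the function names `fuse` and `fake_probability`, the feature values, and the weights are all hypothetical placeholders, and the abstract does not specify how the actual fusion is performed.

```python
import math
from typing import List, Sequence


def fuse(multimodal: Sequence[float],
         debunk: Sequence[float],
         reasoning: Sequence[float]) -> List[float]:
    """Concatenate the three feature groups into one vector (simple late fusion)."""
    return list(multimodal) + list(debunk) + list(reasoning)


def fake_probability(features: Sequence[float],
                     weights: Sequence[float],
                     bias: float = 0.0) -> float:
    """Logistic score over the fused vector.

    The weights are hand-picked placeholders (assumption); a real system
    would learn them from labeled data.
    """
    if len(features) != len(weights):
        raise ValueError("features and weights must have the same length")
    z = sum(f * w for f, w in zip(features, weights)) + bias
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical toy inputs standing in for encoder outputs.
multimodal = [0.2, -0.1]   # e.g. pooled video/audio/text features
debunk = [0.7]             # e.g. a cue derived from generated debunking evidence
reasoning = [0.5, 0.3]     # e.g. features derived from MLLM reasoning text

fused = fuse(multimodal, debunk, reasoning)
weights = [0.5, 0.5, 1.0, 1.0, 1.0]
p = fake_probability(fused, weights)
label = "fake" if p >= 0.5 else "real"
print(f"P(fake) = {p:.3f} -> {label}")
```

Concatenation is only the simplest possible fusion choice; attention-based or gated fusion would be equally plausible readings of "a unified architecture".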