KEN: Knowledge Augmentation and Emotion Guidance Network for Multimodal Fake News Detection
This work addresses the spread of misinformation on social media, a problem for both users and platforms. Its contribution is incremental: it builds on existing multimodal detection methods and adds specific enhancements.
The paper tackles two weaknesses in multimodal fake news detection: inadequate understanding of image semantics and limited textual information. It proposes a Knowledge Augmentation and Emotion Guidance Network (KEN) that leverages large vision-language models (LVLMs) for semantic understanding and balances learning across emotional types, achieving superior performance on two real-world datasets.
In recent years, the rampant spread of misinformation on social media has made accurate detection of multimodal fake news a critical research focus. However, previous methods do not adequately capture the semantics of images, and models struggle to judge news authenticity from limited textual information. Moreover, treating all emotional types of news uniformly, without tailored approaches, further degrades performance. We therefore propose a novel Knowledge Augmentation and Emotion Guidance Network (KEN). On the one hand, we leverage the powerful semantic understanding and extensive world knowledge of large vision-language models (LVLMs). For images, the generated captions provide a comprehensive understanding of image content and scenes; for text, the retrieved evidence helps break the information silos caused by closed and limited text and context. On the other hand, we account for inter-class differences between emotional types of news through balanced learning, achieving fine-grained modeling of the relationship between emotional type and authenticity. Extensive experiments on two real-world datasets demonstrate the superiority of KEN.
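The abstract does not say how balanced learning across emotional types is implemented. As a minimal sketch, assuming simple inverse-frequency reweighting (a stand-in technique, not necessarily the method used in KEN), each emotion group could be given the same total weight in a binary real/fake classification loss. The function names and emotion labels below are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def emotion_balanced_weights(emotions):
    """Return one weight per sample, inversely proportional to the
    frequency of its emotion type. Weights average to 1 over the batch.

    Assumption: inverse-frequency weighting stands in for the paper's
    unspecified balanced-learning scheme.
    """
    counts = Counter(emotions)
    n_samples = len(emotions)
    n_groups = len(counts)
    # Every emotion group ends up with total weight n_samples / n_groups.
    return [n_samples / (n_groups * counts[e]) for e in emotions]


def balanced_bce(probs, labels, emotions, eps=1e-12):
    """Binary cross-entropy (1 = fake, 0 = real), reweighted per emotion group."""
    weights = emotion_balanced_weights(emotions)
    total = 0.0
    for p, y, w in zip(probs, labels, weights):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -w * (y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(probs)


# Hypothetical batch: three "negative" news items and one "positive" item.
emotions = ["negative", "negative", "negative", "positive"]
weights = emotion_balanced_weights(emotions)
loss = balanced_bce([0.9, 0.8, 0.7, 0.4], [1, 1, 1, 0], emotions)
```

With this weighting, the single "positive" sample contributes as much to the loss as the three "negative" samples combined, so a minority emotion type is not drowned out during training.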