AI · CL · CV · Sep 16, 2020

Detecting Cross-Modal Inconsistency to Defend Against Neural Fake News

arXiv:2009.07698v5 · 1004 citations
AI Analysis

This paper addresses disinformation aimed at the general population and offers a more realistic defense mechanism than text-only approaches, though its contribution is incremental: it extends existing methods to multimodal settings.

The paper tackles the problem of defending against machine-generated fake news that includes images and captions, introducing a NeuralNews dataset and a method based on detecting visual-semantic inconsistencies as an effective first line of defense.

Large-scale dissemination of disinformation online intended to mislead or deceive the general population is a major societal problem. Rapid progression in image, video, and natural language generative models has only exacerbated this situation and intensified our need for an effective defense mechanism. While existing approaches have been proposed to defend against neural fake news, they are generally constrained to the very limited setting where articles only have text and metadata such as the title and authors. In this paper, we introduce the more realistic and challenging task of defending against machine-generated news that also includes images and captions. To identify the possible weaknesses that adversaries can exploit, we create a NeuralNews dataset composed of 4 different types of generated articles as well as conduct a series of human user study experiments based on this dataset. In addition to the valuable insights gleaned from our user study experiments, we provide a relatively effective approach based on detecting visual-semantic inconsistencies, which will serve as an effective first line of defense and a useful reference for future work in defending against machine-generated disinformation.
