CL | CV | Apr 8, 2023

Factify 2: A Multimodal Fake News and Satire News Dataset

Apple, Stanford
arXiv:2304.03897v2 | 39 citations | h-index: 53 | Has Code
Originality: Synthesis-oriented
AI Analysis

This work addresses the need for multimodal fact-checking datasets to combat fake news, but it is incremental, as it builds on an existing dataset.

The paper tackles the problem of fake news by introducing FACTIFY 2, a multimodal dataset of 50,000 instances that adds satire articles and improves upon the previous version; its baseline model achieves a 65% F1 score.

The internet gives the world an open platform to express views and share stories. While this is very valuable, it also makes fake news one of our society's most pressing problems. Manual fact-checking is time-consuming, which makes it challenging to disprove misleading assertions before they cause significant harm. This is driving interest in automatic fact or claim verification. Some existing datasets aim to support the development of automated fact-checking techniques; however, most of them are text-based. Multi-modal fact verification has received relatively scant attention. In this paper, we provide a multi-modal fact-checking dataset called FACTIFY 2, improving on FACTIFY 1 by using new data sources and adding satire articles. FACTIFY 2 has 50,000 new data instances. Similar to FACTIFY 1.0, we have three broad categories (support, no-evidence, and refute), with sub-categories based on the entailment of visual and textual data. We also provide a BERT and Vision Transformer based baseline, which achieves a 65% F1 score on the test set. The baseline code and the dataset will be made available at https://github.com/surya1701/Factify-2.0.
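To make the reported metric concrete, here is a minimal sketch of how an F1 score over the three broad categories could be computed. The abstract does not say which averaging scheme the 65% figure uses, so the support-weighted average below is an assumption, and the names `BroadLabel`, `per_class_f1`, and `weighted_f1` are hypothetical helpers, not part of the released code.

```python
from collections import Counter
from enum import Enum


class BroadLabel(Enum):
    # The three broad categories named in the abstract.
    SUPPORT = "support"
    NO_EVIDENCE = "no-evidence"
    REFUTE = "refute"


def per_class_f1(gold, pred):
    """Return {label: F1} from paired gold and predicted labels."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[g] += 1  # gold class gains a false negative
    scores = {}
    for label in BroadLabel:
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def weighted_f1(gold, pred):
    """Average per-class F1, weighted by gold frequency (assumed scheme)."""
    if not gold:
        return 0.0
    support = Counter(gold)
    scores = per_class_f1(gold, pred)
    return sum(scores[label] * support[label] for label in BroadLabel) / len(gold)
```

Swapping the weighting for a plain mean over `BroadLabel` would give macro F1 instead; which of the two matches the paper's number would need to be checked against the paper itself.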

Code Implementations: 1 repo