CL · CY · IR · LG · Apr 26, 2022

Monant Medical Misinformation Dataset: Mapping Articles to Fact-Checked Claims

arXiv:2204.12294v1 · 26 citations · h-index: 32
Originality: Synthesis-oriented
AI Analysis

This dataset helps researchers address medical misinformation, particularly during COVID-19, but it is incremental as it builds on existing fact-checking resources.

The authors tackled the problem of medical misinformation by creating a dataset of 317k medical articles and 3.5k fact-checked claims, with mappings for claim presence and stance, and provided baselines evaluated on manually labeled data.

False information has a significant negative influence on individuals as well as on the whole society. Especially in the current COVID-19 era, we witness an unprecedented growth of medical misinformation. To help tackle this problem with machine learning approaches, we are publishing a feature-rich dataset of approx. 317k medical news articles/blogs and 3.5k fact-checked claims. It also contains 573 manually and more than 51k automatically labelled mappings between claims and articles. Mappings consist of claim presence, i.e., whether a claim is contained in a given article, and article stance towards the claim. We provide several baselines for these two tasks and evaluate them on the manually labelled part of the dataset. The dataset enables a number of additional tasks related to medical misinformation, such as misinformation characterisation studies or studies of misinformation diffusion between sources.
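
The claim-article mappings described in the abstract can be pictured as simple records. The following is a minimal sketch, not the dataset's actual schema: the class name, field names, stance label strings, and the toy rows are all assumptions made for illustration. It shows how a presence baseline might be scored only on the manually labelled mappings, as the abstract describes.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass(frozen=True)
class ClaimArticleMapping:
    """Hypothetical record for one claim-article pair (field names are assumed)."""
    claim_id: int
    article_id: int
    present: bool               # claim presence: is the claim contained in the article?
    stance: Optional[str]       # article stance towards the claim; label strings are assumed
    manually_labelled: bool     # True for manual labels, False for automatic ones


def manual_subset(mappings: List[ClaimArticleMapping]) -> List[ClaimArticleMapping]:
    """Keep only the manually labelled mappings, which serve as the evaluation set."""
    return [m for m in mappings if m.manually_labelled]


def presence_accuracy(
    mappings: List[ClaimArticleMapping],
    predict: Callable[[int, int], bool],
) -> float:
    """Accuracy of a claim-presence predictor on the manually labelled mappings."""
    gold = manual_subset(mappings)
    if not gold:
        return 0.0
    correct = sum(predict(m.claim_id, m.article_id) == m.present for m in gold)
    return correct / len(gold)


# Toy rows invented for illustration; they are not taken from the dataset.
toy = [
    ClaimArticleMapping(1, 10, True, "supporting", True),
    ClaimArticleMapping(1, 11, True, "contradicting", True),
    ClaimArticleMapping(2, 12, False, None, True),
    ClaimArticleMapping(2, 13, True, "supporting", False),  # automatic label, excluded
]

# A trivial "always present" predictor, used only to exercise the function.
always_present = lambda claim_id, article_id: True
accuracy = presence_accuracy(toy, always_present)
```

Keeping the manual/automatic flag on each record makes it easy to train on the larger automatically labelled portion while evaluating only on the manual portion.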

Code Implementations: 1 repo
