SI · CL · CY · Nov 7, 2024

A Guide to Misinformation Detection Data and Evaluation

arXiv:2411.05060v5 · 9 citations · h-index: 15 · KDD
Originality: Synthesis-oriented
AI Analysis

This work addresses data quality and evaluation challenges for researchers in misinformation detection, though it is incremental as it builds on existing datasets and methods.

The authors tackled the problem of data deficiencies in misinformation detection by curating the largest collection of 75 datasets and evaluating the quality of 45 datasets, finding that most have severe issues like ambiguous examples that hinder accurate model evaluation.

Misinformation is a complex societal issue, and mitigating solutions are difficult to create due to data deficiencies. To address this, we have curated the largest collection of (mis)information datasets in the literature, totaling 75. From these, we evaluated the quality of 36 datasets that consist of statements or claims, as well as the 9 datasets that consist of data in purely paragraph form. We assess these datasets to identify those with solid foundations for empirical work and those with flaws that could result in misleading and non-generalizable results, such as spurious correlations, or examples that are ambiguous or otherwise impossible to assess for veracity. We find the latter issue is particularly severe and affects most datasets in the literature. We further provide state-of-the-art baselines on all these datasets, but show that regardless of label quality, categorical labels may no longer give an accurate evaluation of detection model performance. Finally, we propose and highlight Evaluation Quality Assurance (EQA) as a tool to guide the field toward systemic solutions rather than inadvertently propagating issues in evaluation. Overall, this guide aims to provide a roadmap for higher quality data and better grounded evaluations, ultimately improving research in misinformation detection. All datasets and other artifacts are available at https://misinfo-datasets.complexdatalab.com/.

