IR · AI · Apr 25, 2023

Introducing MBIB -- the first Media Bias Identification Benchmark Task and Dataset Collection

arXiv:2304.13148v1 · 33 citations · h-index: 43
Originality: Synthesis-oriented
AI Analysis

This paper addresses fragmented evaluation in media bias detection, a problem for both researchers and practitioners. The contribution is incremental, however, because it builds on existing datasets and methods.

The authors address the lack of a unified benchmark for media bias detection by introducing MBIB, a comprehensive benchmark that groups nine tasks and 22 datasets. They find that models perform well on some bias types, such as hate speech, but struggle with others, such as cognitive and political bias, and that no single technique significantly outperforms the others.

Although media bias detection is a complex multi-task problem, there is, to date, no unified benchmark grouping these evaluation tasks. We introduce the Media Bias Identification Benchmark (MBIB), a comprehensive benchmark that groups different types of media bias (e.g., linguistic, cognitive, political) under a common framework to test how prospective detection techniques generalize. After reviewing 115 datasets, we select nine tasks and carefully propose 22 associated datasets for evaluating media bias detection techniques. We evaluate MBIB using state-of-the-art Transformer techniques (e.g., T5, BART). Our results suggest that while hate speech, racial bias, and gender bias are easier to detect, models struggle to handle certain bias types, e.g., cognitive and political bias. However, our results show that no single technique can outperform all the others significantly. We also find an uneven distribution of research interest and resource allocation to the individual tasks in media bias. A unified benchmark encourages the development of more robust systems and shifts the current paradigm in media bias detection evaluation towards solutions that tackle not one but multiple media bias types simultaneously.

Code Implementations: 1 repo
