CL | Apr 1, 2024

A Study on Scaling Up Multilingual News Framing Analysis

arXiv:2404.01481v1 | 32 citations | h-index: 7 | NAACL-HLT
Originality: Synthesis-oriented
AI Analysis

This paper addresses the shortage of resources for media framing research across diverse languages, though its contribution is incremental: it extends existing methods to new contexts.

This study tackled the lack of datasets for multilingual news framing analysis by creating a crowd-sourced dataset across 12 languages, resulting in a 5.32 percentage point performance improvement over the baseline.

Media framing is the study of strategically selecting and presenting specific aspects of political issues to shape public opinion. Despite its relevance to almost all societies around the world, research has been limited due to the lack of available datasets and other resources. This study explores the possibility of dataset creation through crowdsourcing, utilizing non-expert annotators to develop training corpora. We first extend framing analysis beyond English news to a multilingual context (12 typologically diverse languages) through automatic translation. We also present a novel benchmark in Bengali and Portuguese on the immigration and same-sex marriage domains. Additionally, we show that a system trained on our crowd-sourced dataset, combined with other existing ones, leads to a 5.32 percentage point increase from the baseline, showing that crowdsourcing is a viable option. Last, we study the performance of large language models (LLMs) for this task, finding that task-specific fine-tuning is a better approach than employing bigger non-specialized models.

Code Implementations: 1 repo