CL, Oct 24, 2024

Monolingual and Multilingual Misinformation Detection for Low-Resource Languages: A Comprehensive Survey

Xinyu Wang, Wenbo Zhang, Sarah Rajtmajer

arXiv:2410.18390v24.213 citations, h-index: 17

Originality: Synthesis-oriented

AI Analysis

The paper addresses the challenge of misinformation moderation for low-resource language communities. As a survey, its contribution is incremental: it synthesizes existing research rather than proposing new methods.

This survey tackles the problem of misinformation detection in low-resource languages by reviewing existing datasets, methodologies, and tools, highlighting challenges like data scarcity and cultural context, and emphasizing the need for improved systems to handle diverse linguistic settings.

In today's global digital landscape, misinformation transcends linguistic boundaries, posing a significant challenge for moderation systems. Most approaches to misinformation detection are monolingual, focused on high-resource languages, i.e., a handful of world languages that have benefited from substantial research investment. This survey provides a comprehensive overview of the current research on misinformation detection in low-resource languages, both in monolingual and multilingual settings. We review existing datasets, methodologies, and tools used in these domains, identifying key challenges related to: data resources, model development, cultural and linguistic context, and real-world applications. We examine emerging approaches, such as language-generalizable models and multi-modal techniques, and emphasize the need for improved data collection practices, interdisciplinary collaboration, and stronger incentives for socially responsible AI research. Our findings underscore the importance of systems capable of addressing misinformation across diverse linguistic and cultural contexts.
