CL CY SI · Oct 27, 2023

Lost in translation: using global fact-checks to measure multilingual misinformation prevalence, spread, and evolution

Dorian Quelle, Calvin Cheng, Alexandre Bovet, Scott A. Hale

arXiv:2310.18089v23.35 citations · h-index: 17

Originality: Highly original

AI Analysis

This work addresses the challenge that misinformation spreading across languages poses for fact-checkers and policymakers, and it provides novel quantitative insights into the dynamics of that spread.

The paper tackles the problem of measuring multilingual misinformation prevalence and cross-language diffusion by analyzing 264,487 fact-checks across 95 languages. It finds that 10.26% of claims are checked multiple times and that 32.26% of repeated claims cross linguistic barriers.

Misinformation and disinformation are growing threats in the digital age, affecting people across languages and borders. However, no research has investigated the prevalence of multilingual misinformation and quantified the extent to which misinformation diffuses across languages. This paper investigates the prevalence and dynamics of multilingual misinformation through an analysis of 264,487 fact-checks spanning 95 languages. To study the evolution of claims over time and mutations across languages, we represent fact-checks with multilingual sentence embeddings and build a graph where semantically similar claims are linked. We provide quantitative evidence of repeated fact-checking efforts and establish that claims diffuse across languages. Specifically, we find that while the majority of misinformation claims are only fact-checked once, 10.26%, corresponding to more than 27,000 claims, are checked multiple times. Using fact-checks as a proxy for the spread of misinformation, we find that 32.26% of repeated claims cross linguistic boundaries, suggesting that some misinformation permeates language barriers. However, spreading patterns exhibit strong assortativity, with misinformation more likely to spread within the same language or language family. Next, we show that fact-checkers take more time to fact-check claims that have crossed language barriers, and we model the temporal and cross-lingual evolution of claims. We analyze connected components and shortest paths connecting different versions of a claim, finding that claims gradually drift over time and undergo greater alteration when traversing languages. Because misinformation changes over time, static claim-matching algorithms become less effective. The findings advocate for expanded information sharing between fact-checkers globally while underscoring the importance of localized verification.
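The graph construction described in the abstract (embed each fact-check, link semantically similar ones, then inspect connected components) can be illustrated with a minimal sketch. This is not the authors' implementation: the three-dimensional vectors are hand-made stand-ins for multilingual sentence embeddings, the claim ids and language codes are invented, and the 0.95 cosine threshold is an assumed value chosen only so the toy data clusters as intended.

```python
import math
from itertools import combinations

# Toy stand-ins for multilingual sentence embeddings (hypothetical values).
# Each entry: (claim id, language code, embedding vector).
FACT_CHECKS = [
    ("a", "en", [0.90, 0.10, 0.00]),
    ("b", "es", [0.88, 0.12, 0.02]),
    ("c", "en", [0.00, 1.00, 0.00]),
    ("d", "en", [0.05, 0.98, 0.00]),
    ("e", "fr", [0.00, 0.00, 1.00]),
]

SIMILARITY_THRESHOLD = 0.95  # assumed cutoff, not taken from the paper


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))


def connected_components(items, threshold):
    """Link items whose similarity meets the threshold; return the groups."""
    parent = list(range(len(items)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(items)), 2):
        if cosine(items[i][2], items[j][2]) >= threshold:
            parent[find(i)] = find(j)

    groups = {}
    for i, item in enumerate(items):
        groups.setdefault(find(i), []).append(item)
    return list(groups.values())


components = connected_components(FACT_CHECKS, SIMILARITY_THRESHOLD)

# A component with more than one fact-check is a claim checked repeatedly.
repeated = [c for c in components if len(c) > 1]

# A repeated claim is cross-lingual if its fact-checks span several languages.
cross_lingual = [c for c in repeated if len({lang for _, lang, _ in c}) > 1]

share_repeated = len(repeated) / len(components)
share_cross_lingual = len(cross_lingual) / len(repeated)

print(f"components: {len(components)}")
print(f"share of claims checked more than once: {share_repeated:.2%}")
print(f"share of repeated claims crossing languages: {share_cross_lingual:.2%}")
```

The two shares computed at the end are toy analogues of the 10.26% and 32.26% figures reported in the abstract. On a corpus of the size studied in the paper, the all-pairs comparison would be impractical, and an approximate nearest-neighbor index would likely be needed instead.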
