CL
Oct 13, 2021

Systematic Inequalities in Language Technology Performance across the World's Languages

arXiv:2110.06733v1
652 citations
Originality: Synthesis-oriented
AI Analysis

This paper addresses inequality in access to language technology, and in its performance, for speakers of underrepresented languages.

The study quantified systematic performance disparities in natural language processing (NLP) technologies across the world's languages, revealing that progress is restricted to a minuscule subset of languages, and provided recommendations for more equitable development.

Natural language processing (NLP) systems have become a central technology in communication, education, medicine, artificial intelligence, and many other domains of research and development. While the performance of NLP methods has grown enormously over the last decade, this progress has been restricted to a minuscule subset of the world's 6,500 languages. We introduce a framework for estimating the global utility of language technologies as revealed in a comprehensive snapshot of recent publications in NLP. Our analyses involve the field at large, but also more in-depth studies on both user-facing technologies (machine translation, language understanding, question answering, text-to-speech synthesis) as well as more linguistic NLP tasks (dependency parsing, morphological inflection). In the process, we (1) quantify disparities in the current state of NLP research, (2) explore some of its associated societal and academic factors, and (3) produce tailored recommendations for evidence-based policy making aimed at promoting more global and equitable language technologies.

Code Implementations: 2 repos