Language Varieties of Italy: Technology Challenges and Opportunities
The paper addresses the risk that local languages and dialects in Italy will disappear and advocates a paradigm shift in NLP, though its recommendations are incremental.
The paper tackles the problem of endangered local languages and dialects in Italy by challenging machine-centric NLP assumptions and advocating for a speaker-centric paradigm, proposing community-building for responsible efforts to support language vitality.
Italy is characterized by a linguistic diversity landscape that is one of a kind in Europe, and which implicitly encodes the local knowledge, cultural traditions, artistic expressions and history of its speakers. However, most local languages and dialects in Italy are at risk of disappearing within a few generations. The NLP community has recently begun to engage with endangered languages, including those of Italy. Yet, most efforts assume that these varieties are under-resourced language monoliths with an established written form and homogeneous functions and needs, and thus highly interchangeable with each other and with high-resource, standardized languages. In this paper, we introduce the linguistic context of Italy and challenge the default machine-centric assumptions of NLP for Italy's language varieties. We advocate for a paradigm shift from machine-centric to speaker-centric NLP, and provide recommendations and opportunities for work that prioritizes languages and their speakers over technological advances. To facilitate this process, we finally propose building a local community towards responsible, participatory efforts aimed at supporting the vitality of the languages and dialects of Italy.