CL, CY | Nov 2, 2024

Diversidade linguística e inclusão digital: desafios para uma IA brasileira (Linguistic diversity and digital inclusion: challenges for a Brazilian AI)

arXiv:2411.01259v1 | 1 citation | h-index: 11 | Anais da I Conferência Latino-Americana de Ética em Inteligência Artificial (LAAI-Ethics 2024)

Originality: Synthesis-oriented

AI Analysis

The paper addresses digital exclusion and language loss among speakers of underrepresented languages, an incremental but critical issue in AI development.

The paper examines how generative AI's reliance on documented languages creates a selection bias that threatens linguistic diversity, leading to a vicious cycle where dominant languages become standardized while others are marginalized.

Linguistic diversity is a human attribute that is coming under threat with the advance of generative AI. Drawing on contributions from sociolinguistics, this paper examines the consequences of the variety selection bias imposed by technological applications, and the vicious circle in which a variety is preserved and becomes dominant and standardized because it has the linguistic documentation needed to train large language models.
