CL CY Oct 14, 2024

Performance in a dialectal profiling task of LLMs for varieties of Brazilian Portuguese

arXiv:2410.10991v1 3 citations h-index: 11 STIL
Originality: Synthesis-oriented
AI Analysis

This work addresses dialectal bias in NLP for Brazilian Portuguese speakers, but it is incremental: it applies existing methods to new data.

The study investigated how large language models (LLMs) discriminate among varieties of Brazilian Portuguese. It found that the four models tested (GPT-3.5, GPT-4o, Gemini, and Sabiá-2) reproduce dialectal biases and do not consistently apply sociolinguistic rules.

Different kinds of bias are reproduced in LLM-generated responses, including dialectal bias. A study based on prompt engineering was carried out to uncover how LLMs discriminate among varieties of Brazilian Portuguese, specifically whether sociolinguistic rules are taken into account by four LLMs: GPT-3.5, GPT-4o, Gemini, and Sabiá-2. The results offer sociolinguistic contributions toward an equity-fluent NLP technology.

