CL | AI | Mar 25, 2024

LLMs Are Few-Shot In-Context Low-Resource Language Learners

arXiv:2403.16512v5 | 103 citations | h-index: 25 | Has Code | NAACL
Originality: Incremental advance
AI Analysis

This work addresses the gap in ICL research for low-resource languages, providing insights and methods to improve performance, though it is incremental as it builds on existing ICL frameworks.

The study evaluates in-context learning (ICL) and its cross-lingual variant (X-ICL) on 25 low-resource and 7 relatively higher-resource languages. It identifies shortcomings in in-context label alignment and introduces query alignment as a more effective alternative. It concludes that semantically relevant few-shot context improves LLM understanding of low-resource languages by closing the language gap in the target language and aligning its semantics with a high-resource language the model is proficient in.

In-context learning (ICL) empowers large language models (LLMs) to perform diverse tasks in underrepresented languages using only short in-context information, offering a crucial avenue for narrowing the gap between high-resource and low-resource languages. Nonetheless, only a handful of works have explored ICL for low-resource languages, and most of them focus on relatively high-resource languages such as French and Spanish. In this work, we extensively study ICL and its cross-lingual variation (X-ICL) on 25 low-resource and 7 relatively higher-resource languages. Our study not only assesses the effectiveness of ICL with LLMs in low-resource languages but also identifies the shortcomings of in-context label alignment and introduces a more effective alternative: query alignment. Moreover, we provide valuable insights into various facets of ICL for low-resource languages. Our study concludes that few-shot in-context information is significant for enhancing the low-resource understanding quality of LLMs: semantically relevant information closes the language gap in the target language and aligns the semantics between the targeted low-resource language and the high-resource language that the model is proficient in. Our work highlights the importance of advancing ICL research, particularly for low-resource languages. Our code is publicly released at https://github.com/SamuelCahyawijaya/in-context-alignment
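
To make the idea of few-shot prompting with query alignment concrete, the following is a minimal sketch of how such a prompt might be assembled. It is not the authors' implementation: the function `build_prompt`, the `Exemplar` type, the prompt templates, and the placeholder sentences are all hypothetical, and the released repository should be consulted for the actual prompt format.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class Exemplar:
    """One labelled few-shot example (hypothetical helper type)."""
    text: str
    label: str


def build_prompt(
    exemplars: Sequence[Exemplar],
    query: str,
    target_lang: str,
    alignment_pairs: Sequence[Tuple[str, str]] = (),
) -> str:
    """Assemble a few-shot prompt. All templates are illustrative assumptions."""
    blocks: List[str] = []

    # Assumed query-alignment block: parallel sentences that tie
    # target-language inputs to a language the model handles well.
    for source_sentence, target_sentence in alignment_pairs:
        blocks.append(
            f'In {target_lang}, "{source_sentence}" is written as "{target_sentence}".'
        )

    # Few-shot task exemplars. Target-language exemplars give plain ICL;
    # exemplars in a high-resource language give the cross-lingual variant.
    for ex in exemplars:
        blocks.append(f"Text: {ex.text}\nLabel: {ex.label}")

    # The unanswered query goes last so the model completes the label.
    blocks.append(f"Text: {query}\nLabel:")
    return "\n\n".join(blocks)


if __name__ == "__main__":
    demo = build_prompt(
        exemplars=[
            Exemplar("The film was wonderful.", "positive"),
            Exemplar("The service was terrible.", "negative"),
        ],
        query="<target-language query sentence>",
        target_lang="<target language>",
        alignment_pairs=[("<source sentence>", "<its target-language translation>")],
    )
    print(demo)
```

Passing an empty `alignment_pairs` reduces this to an ordinary few-shot prompt, which makes it easy to compare prompts with and without the alignment block.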

Code Implementations: 1 repo