CL | Feb 12, 2024

Walia-LLM: Enhancing Amharic-LLaMA by Integrating Task-Specific and Generative Datasets

arXiv:2402.08015v5 | 25 citations | h-index: 20 | Has Code | EMNLP
Originality: Synthesis-oriented
AI Analysis

This work narrows the NLP resource gap for Amharic by releasing enhanced models and open-source resources, though the contribution is incremental because it builds on the existing LLaMA-2-Amharic model.

The researchers address weak large language model performance on low-resource languages by fine-tuning LLaMA-2-Amharic on a combination of task-specific and generative datasets, and report promising results on Amharic NLP tasks.

Large language models (LLMs) have received a lot of attention in natural language processing (NLP) research because of their exceptional performance in understanding and generating human languages. However, low-resource languages are left behind due to the unavailability of resources. In this work, we focus on enhancing the LLaMA-2-Amharic model by integrating task-specific and generative datasets to improve language model performance for Amharic. We compile an Amharic instruction fine-tuning dataset and fine-tune the LLaMA-2-Amharic model. The fine-tuned model shows promising results in different NLP tasks. We open-source our dataset creation pipeline, instruction datasets, trained models, and evaluation outputs to promote language-specific studies on these models.

