CL AI Dec 25, 2023

PersianLLaMA: Towards Building First Persian Large Language Model

Mohammad Amin Abbasi, Arash Ghafouri, Mahdi Firouzmandi, Hassan Naderi, Behrouz Minaei Bidgoli

arXiv:2312.15713v15.820 citations h-index: 38

Originality: Synthesis-oriented

AI Analysis

The paper addresses the limited NLP resources available to the Persian-speaking community. It is a foundational but incremental step, since it applies existing methods to a new language.

The paper tackles the lack of large language models for Persian by introducing PersianLLaMA, the first such model with 7 and 13 billion parameters, which significantly outperforms competitors in both understanding and generating Persian text.

Despite the widespread use of the Persian language by millions globally, limited efforts have been made in natural language processing for this language. The use of large language models as effective tools in various natural language processing tasks typically requires extensive textual data and robust hardware resources. Consequently, the scarcity of Persian textual data and the unavailability of powerful hardware resources have hindered the development of large language models for Persian. This paper introduces the first large Persian language model, named PersianLLaMA, trained on a collection of Persian texts and datasets. This foundational model comes in two versions, with 7 and 13 billion parameters, trained on formal and colloquial Persian texts using two different approaches. PersianLLaMA has been evaluated for natural language generation tasks based on the latest evaluation methods, namely using larger language models, and for natural language understanding tasks based on automated machine metrics. The results indicate that PersianLLaMA significantly outperforms its competitors in both understanding and generating Persian text. PersianLLaMA marks an important step in the development of Persian natural language processing and can be a valuable resource for the Persian-speaking community. This large language model can be used for various natural language processing tasks, especially text generation applications such as chatbots, question answering, machine translation, and text summarization.
