CL | AI | Dec 14, 2023

Arabic Mini-ClimateGPT: A Climate Change and Sustainability Tailored Arabic LLM

arXiv:2312.09366v1 | 137 citations | h-index: 35 | Has Code | EMNLP
Originality: Synthesis-oriented
AI Analysis

This work addresses the need for accessible climate change education and policy awareness in Arabic-speaking regions. It is incremental, however, in that it builds on existing open-source LLMs and adds domain-specific fine-tuning.

The authors tackled the lack of climate-specific and Arabic-language capabilities in open-source LLMs by developing a lightweight Arabic Mini-ClimateGPT, which outperformed baseline models in 88.3% of ChatGPT-based evaluations and was preferred by human experts 81.6% of the time.

Climate change is one of the most significant challenges we face together as a society. Creating awareness and educating policy makers about the wide-ranging impact of climate change is an essential step towards a sustainable future. Recently, Large Language Models (LLMs) like ChatGPT and Bard have shown impressive conversational abilities and excel in a wide variety of NLP tasks. While these models are closed-source, alternative open-source LLMs such as Stanford Alpaca and Vicuna have recently shown promising results. However, these open-source models are not specifically tailored to climate-related, domain-specific information, and they also struggle to generate meaningful responses in other languages such as Arabic. To this end, we propose a lightweight Arabic Mini-ClimateGPT that is built on an open-source LLM and is specifically fine-tuned on Clima500-Instruct, a curated conversational-style Arabic instruction-tuning dataset with over 500k instructions about climate change and sustainability. Further, our model also utilizes a vector-embedding-based retrieval mechanism during inference. We validate our proposed model through quantitative and qualitative evaluations on climate-related queries. Our model surpasses the baseline LLM in 88.3% of cases during ChatGPT-based evaluation. Furthermore, our human expert evaluation reveals an 81.6% preference for our model's responses over multiple popular open-source models. Our open-source demos, code-base and models are available at https://github.com/mbzuai-oryx/ClimateGPT.
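The abstract mentions a vector-embedding-based retrieval mechanism at inference time but gives no details. The sketch below illustrates the general idea only and is not the authors' implementation: the bag-of-words `embed` function is a toy stand-in for a learned embedding model, and `retrieve`, `build_prompt`, and the `DOCS` passages are hypothetical names and data chosen for illustration.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Toy stand-in for a learned embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def retrieve(query: str, documents: list[str], k: int = 2) -> list[str]:
    """Return the k documents most similar to the query."""
    q = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

def build_prompt(query: str, documents: list[str], k: int = 2) -> str:
    """Prepend the retrieved passages to the question before calling the LLM."""
    context = "\n".join(retrieve(query, documents, k))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

# Hypothetical passages; a real system would index climate documents instead.
DOCS = [
    "sea level rise threatens coastal cities",
    "solar power reduces carbon emissions",
    "recycling plastic lowers landfill waste",
]

if __name__ == "__main__":
    print(build_prompt("how does solar power cut emissions", DOCS, k=1))
```

In a real system the toy `embed` would be replaced by a multilingual or Arabic sentence-embedding model and the linear scan by a vector index; the retrieve-then-prompt flow stays the same.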

Code Implementations: 1 repo
