CL · AI · Aug 27, 2023

Empowering Cross-lingual Abilities of Instruction-tuned Large Language Models by Translation-following demonstrations

arXiv:2308.14186v1 · 49 citations · h-index: 18
Originality: Incremental advance
AI Analysis

This paper addresses the cross-lingual performance gap that LLMs show for non-English speakers, but the advance is incremental because it builds on existing instruction-tuning methods.

The paper tackles the English-skewed language abilities of instruction-tuned large language models (LLMs) by proposing CrossAlpaca, which uses translation-following demonstrations to improve semantic alignment across languages. The results show that this approach outperforms models tuned on monolingual data on multilingual benchmarks such as XQuAD and MLQA, tested over six languages.

The language ability of Large Language Models (LLMs) is often skewed towards English because of the imbalance in the distribution of the pre-training data. This disparity carries over into further fine-tuning and affects the cross-lingual abilities of LLMs. In this paper, we propose to empower Instruction-tuned LLMs (It-LLMs) in languages other than English by building semantic alignment between those languages and English. To this end, we propose CrossAlpaca, an It-LLM tuned with cross-lingual instruction-following and translation-following demonstrations to improve semantic alignment between languages. We validate our approach on the multilingual Question Answering (QA) benchmarks XQuAD and MLQA and on adapted versions of MMLU and BBH. Our models, tested over six different languages, outperform the It-LLMs tuned on monolingual data. The final results show that instruction tuning on non-English data alone is not enough and that semantic alignment can be further improved by translation-following demonstrations.
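As a rough illustration of what mixing instruction-following and translation-following demonstrations could look like, the sketch below builds Alpaca-style records (`instruction`, `input`, `output`) from parallel sentence pairs and appends them to an existing set of instruction records. The prompt wording, the function names `translation_demo` and `build_mixture`, the choice to emit both translation directions, and the sample sentences are all assumptions made for illustration; the abstract does not specify the exact data format that CrossAlpaca uses.

```python
import json


def translation_demo(src_text, tgt_text, src_lang, tgt_lang):
    """Build one Alpaca-style record that asks for a translation.

    The instruction wording is hypothetical, not taken from the paper.
    """
    return {
        "instruction": f"Translate the following text from {src_lang} to {tgt_lang}.",
        "input": src_text,
        "output": tgt_text,
    }


def build_mixture(instruction_demos, parallel_pairs, src_lang, tgt_lang):
    """Append translation records to a copy of the instruction records.

    Emitting both translation directions is an assumption of this sketch.
    """
    records = list(instruction_demos)
    for src_text, tgt_text in parallel_pairs:
        records.append(translation_demo(src_text, tgt_text, src_lang, tgt_lang))
        records.append(translation_demo(tgt_text, src_text, tgt_lang, src_lang))
    return records


# Made-up sample data: one Italian instruction record and one parallel pair.
instruction_demos = [
    {
        "instruction": "Elenca tre colori primari.",
        "input": "",
        "output": "Rosso, blu e giallo.",
    }
]
parallel_pairs = [("The cat sleeps.", "Il gatto dorme.")]

mixture = build_mixture(instruction_demos, parallel_pairs, "English", "Italian")
print(json.dumps(mixture, ensure_ascii=False, indent=2))
```

The resulting list can be written to a JSON file and passed to any fine-tuning script that accepts Alpaca-format data; how the two kinds of demonstrations are balanced is a design choice the abstract leaves open.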

Code Implementations: 1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
