CL · Jun 29, 2024

Self-Translate-Train: Enhancing Cross-Lingual Transfer of Large Language Models via Inherent Capability

arXiv:2407.00454v2 · 2 citations
Originality: Incremental advance
AI Analysis

The paper addresses cross-lingual transfer for low-resource languages, but the advance is incremental because it builds on existing fine-tuning approaches.

The paper tackles misaligned internal representations in zero-shot cross-lingual transfer for low-resource languages. It proposes Self-Translate-Train, in which a large language model translates its training data into the target language and is then fine-tuned on that self-generated data. The method outperforms zero-shot transfer.

Zero-shot cross-lingual transfer by fine-tuning multilingual pretrained models shows promise for low-resource languages, but often suffers from misalignment of internal representations between languages. We hypothesize that even when the model cannot generalize across languages effectively in fine-tuning, it still captures cross-lingual correspondence useful for cross-lingual transfer. We explore this hypothesis with Self-Translate-Train, a method that lets large language models (LLMs) translate training data into the target language and then fine-tunes the model on its own generated data. By demonstrating that Self-Translate-Train outperforms zero-shot transfer, we encourage further exploration of better methods to elicit cross-lingual capabilities of LLMs.
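
The data-construction step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: `Example`, `build_self_translated_set`, and `toy_translate` are hypothetical names, the translation callable is a placeholder for prompting the LLM itself, and the option to mix source and translated examples is an assumption rather than something the abstract states.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Example:
    text: str   # input text in some language
    label: str  # task label, assumed to be language-independent
    lang: str   # language code of `text`


def build_self_translated_set(
    source_data: List[Example],
    translate: Callable[[str, str], str],
    target_lang: str,
    include_source: bool = True,
) -> List[Example]:
    """Translate each source example and carry its label over unchanged."""
    translated = [
        Example(text=translate(ex.text, target_lang), label=ex.label, lang=target_lang)
        for ex in source_data
    ]
    # Assumption: whether to keep the source-language examples is left configurable.
    return (list(source_data) + translated) if include_source else translated


def toy_translate(text: str, target_lang: str) -> str:
    # Placeholder for asking the LLM to translate; this only tags the text.
    return f"[{target_lang}] {text}"


source = [
    Example("the movie was great", "positive", "en"),
    Example("the plot was dull", "negative", "en"),
]
train_set = build_self_translated_set(source, toy_translate, target_lang="sw")
# `train_set` would then be passed to whatever fine-tuning routine is in use.
```

In a real setting, `translate` would call the same model that is later fine-tuned, which is what distinguishes this setup from translating the training data with an external translation system.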

