Self-Translate-Train: Enhancing Cross-Lingual Transfer of Large Language Models via Inherent Capability
This work addresses cross-lingual transfer for low-resource languages, but the contribution is incremental because it builds on existing fine-tuning approaches.
The paper tackles misaligned internal representations in zero-shot cross-lingual transfer for low-resource languages. It proposes Self-Translate-Train, in which a large language model translates its training data into the target language and is then fine-tuned on that data; the method outperforms zero-shot transfer.
Zero-shot cross-lingual transfer by fine-tuning multilingual pretrained models shows promise for low-resource languages, but often suffers from misalignment of internal representations between languages. We hypothesize that even when the model cannot generalize across languages effectively in fine-tuning, it still captures cross-lingual correspondence useful for cross-lingual transfer. We explore this hypothesis with Self-Translate-Train, a method that lets large language models (LLMs) translate training data into the target language and then fine-tunes the model on its own generated data. By demonstrating that Self-Translate-Train outperforms zero-shot transfer, we encourage further exploration of better methods to elicit cross-lingual capabilities of LLMs.
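
The two steps described above (the model translates its own training data, then is fine-tuned on the result) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the `Example` type, the prompt wording, and the `generate` and `fine_tune` callables are all hypothetical stand-ins for whatever LLM and training loop are actually used. The sketch also assumes that task labels carry over unchanged to the translated text.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Example:
    text: str   # input text of a training example
    label: str  # task label, assumed to be language-independent


def build_translation_prompt(text: str, target_lang: str) -> str:
    # Hypothetical prompt wording; the abstract does not specify one.
    return (
        f"Translate the following text into {target_lang}.\n\n"
        f"Text: {text}\nTranslation:"
    )


def self_translate_train(
    train_data: List[Example],
    target_lang: str,
    generate: Callable[[str], str],
    fine_tune: Callable[[List[Example]], None],
) -> List[Example]:
    """Translate the training set with the model itself, then fine-tune on it.

    `generate` is assumed to wrap the same LLM that `fine_tune` updates.
    """
    translated: List[Example] = []
    for ex in train_data:
        prompt = build_translation_prompt(ex.text, target_lang)
        # The model produces its own target-language version of the input.
        target_text = generate(prompt).strip()
        translated.append(Example(text=target_text, label=ex.label))

    # Fine-tune on the self-generated target-language data.
    fine_tune(translated)
    return translated
```

Passing the model in as two callables keeps the sketch independent of any particular library; in practice `generate` and `fine_tune` would both operate on the same underlying LLM so that the translation really is produced by the model being trained.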