CL · May 27, 2023

Why Does Zero-Shot Cross-Lingual Generation Fail? An Explanation and a Solution

arXiv:2305.17325v1233 citations
Originality: Incremental advance
AI Analysis

This paper addresses a specific bottleneck in multilingual natural language generation and offers an incremental improvement over existing methods.

The paper tackles poor performance in zero-shot cross-lingual generation. It shows that fine-tuning produces language-invariant representations, which harm generation, and it proposes a regularization method and a checkpoint selection method that reduce accidental translation by 68% and improve ROUGE-L by 1.5 on average.

Zero-shot cross-lingual transfer trains a multilingual model to perform a task in one language and then applies it to another language. Although this approach has succeeded on various classification tasks, its output on natural language generation tasks falls short in quality and is sometimes in the wrong language. In our study, we show that the fine-tuning process learns language-invariant representations, which are beneficial for classification tasks but harmful for generation tasks. Motivated by this, we propose a simple method to regularize the model against learning language-invariant representations and a method to select model checkpoints without a development set in the target language, both resulting in better generation quality. Experiments on three semantically diverse generation tasks show that our method reduces the accidental translation problem by 68% and improves the ROUGE-L score by 1.5 on average.
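To make the idea of language-invariant representations concrete, here is a minimal sketch of one way such invariance could be quantified: the average cosine similarity between sentence vectors of parallel sentences in the source and target languages. This is an illustrative assumption and not necessarily the measure used in the paper. The function names, the toy vectors, and the checkpoint labels are hypothetical, and the rule of preferring the checkpoint with the lowest similarity is only a heuristic consistent with the abstract's claim that invariance hurts generation.

```python
import math
from typing import Dict, List, Sequence

Vector = Sequence[float]


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity of two vectors; returns 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def mean_parallel_similarity(src: List[Vector], tgt: List[Vector]) -> float:
    """Average cosine similarity over aligned (parallel) sentence vectors.

    Values near 1.0 suggest the model encodes both languages almost
    identically, i.e. the representations are close to language-invariant.
    """
    if not src or len(src) != len(tgt):
        raise ValueError("src and tgt must be non-empty and the same length")
    return sum(cosine(s, t) for s, t in zip(src, tgt)) / len(src)


def pick_least_invariant(scores: Dict[str, float]) -> str:
    """Hypothetical selection rule: choose the checkpoint with the lowest score."""
    return min(scores, key=scores.get)


# Toy 2-d "sentence vectors" standing in for pooled encoder outputs.
src_vectors = [[1.0, 0.0], [0.0, 1.0]]
tgt_vectors_a = [[1.0, 0.0], [1.0, 0.0]]  # second pair is orthogonal
tgt_vectors_b = [[1.0, 0.0], [0.0, 1.0]]  # identical to the source vectors

scores = {
    "checkpoint_a": mean_parallel_similarity(src_vectors, tgt_vectors_a),
    "checkpoint_b": mean_parallel_similarity(src_vectors, tgt_vectors_b),
}
selected = pick_least_invariant(scores)
```

A score of this kind needs only parallel sentences, not task labels in the target language, which is what makes it a plausible stand-in when no target-language development set is available.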
