CL · LG · Apr 29, 2020

Exploring Fine-tuning Techniques for Pre-trained Cross-lingual Models via Continual Learning

arXiv:2004.14218v2 · 19 citations
AI Analysis

This paper addresses a specific problem for NLP researchers and practitioners working with multilingual models, but its contribution is incremental: it builds on existing fine-tuning and continual learning techniques.

The paper tackles the problem that fine-tuning a pre-trained cross-lingual model weakens its cross-lingual ability. It uses continual learning to preserve that ability during fine-tuning, which leads to better results on a sentence retrieval task and on zero-shot cross-lingual part-of-speech tagging and named entity recognition.

Recently, fine-tuning pre-trained language models (e.g., multilingual BERT) on downstream cross-lingual tasks has shown promising results. However, the fine-tuning process inevitably changes the parameters of the pre-trained model and weakens its cross-lingual ability, which leads to sub-optimal performance. To alleviate this problem, we leverage continual learning to preserve the original cross-lingual ability of the pre-trained model when we fine-tune it on downstream tasks. Experimental results show that our fine-tuning methods better preserve the cross-lingual ability of the pre-trained model on a sentence retrieval task. Our methods also achieve better performance than other fine-tuning baselines on zero-shot cross-lingual part-of-speech tagging and named entity recognition.
