CLAIApr 3, 2023

Efficiently Aligned Cross-Lingual Transfer Learning for Conversational Tasks using Prompt-Tuning

arXiv:2304.01295v4105 citationsh-index: 62
Originality Incremental advance
AI Analysis

This addresses the problem of high data costs and limited coverage for non-English conversational tasks, enabling better cross-lingual transfer, though it is incremental as it builds on existing datasets and methods.

The paper tackled the limited cross-lingual transfer for conversational tasks by creating XSGD, a multilingual dataset with 330k utterances per language across 105 languages, and introduced an efficient prompt-tuning method for alignment, achieving strong improvements in few-shot settings for slot-filling and intent classification.

Cross-lingual transfer of language models trained on high-resource languages like English has been widely studied for many NLP tasks, but focus on conversational tasks has been rather limited. This is partly due to the high cost of obtaining non-English conversational data, which results in limited coverage. In this work, we introduce XSGD for cross-lingual alignment pretraining, a parallel and large-scale multilingual conversation dataset that we created by translating the English-only Schema-Guided Dialogue (SGD) dataset (Rastogi et al., 2020) into 105 other languages. XSGD contains approximately 330k utterances per language. To facilitate aligned cross-lingual representations, we develop an efficient prompt-tuning-based method for learning alignment prompts. We also investigate two different classifiers: NLI-based and vanilla classifiers, and test cross-lingual capability enabled by the aligned prompts. We evaluate our model's cross-lingual generalization capabilities on two conversation tasks: slot-filling and intent classification. Our results demonstrate the strong and efficient modeling ability of NLI-based classifiers and the large cross-lingual transfer improvements achieved by our aligned prompts, particularly in few-shot settings. In addition, we highlight the nice results of our approach compared to LLMs such as text-davinci-003 and ChatGPT in both zero-shot and few-shot settings. While LLMs exhibit impressive performance in English, their cross-lingual capabilities in other languages, particularly low-resource languages, are limited.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes