CL · Jan 31, 2023

Zero-shot cross-lingual transfer language selection using linguistic similarity

arXiv:2301.13720v161 citations · h-index: 23
Originality: Incremental advance
AI Analysis

The paper addresses the challenge of leveraging high-resource languages for low-resource NLP applications. The contribution is incremental, since it builds on existing cross-lingual transfer methods.

The paper tackles the problem of selecting optimal transfer languages for NLP tasks by using linguistic similarity metrics instead of intuition. It shows that linguistic similarity correlates with cross-lingual transfer performance, and that choosing the optimal source language yields a statistically significant difference compared with defaulting to English.

We study the selection of transfer languages for different Natural Language Processing tasks, specifically sentiment analysis, named entity recognition and dependency parsing. In order to select an optimal transfer language, we propose to utilize different linguistic similarity metrics to measure the distance between languages and make the choice of transfer language based on this information instead of relying on intuition. We demonstrate that linguistic similarity correlates with cross-lingual transfer performance for all of the proposed tasks. We also show that there is a statistically significant difference in choosing the optimal language as the transfer source instead of English. This allows us to select a more suitable transfer language which can be used to better leverage knowledge from high-resource languages in order to improve the performance of language applications lacking data. For the study, we used datasets from eight different languages from three language families.
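The selection step described in the abstract, measuring the distance between languages and choosing the closest one as the transfer source, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the abstract does not name the similarity metrics used, so cosine distance over binary feature vectors stands in for them, and the language names, the feature values, and the helper functions `cosine_distance` and `rank_transfer_languages` are hypothetical.

```python
import math


def cosine_distance(u, v):
    """Cosine distance between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_u * norm_v)


def rank_transfer_languages(target, candidates, features):
    """Return candidate source languages ordered from closest to farthest."""
    target_vec = features[target]
    scored = [
        (cosine_distance(features[lang], target_vec), lang)
        for lang in candidates
        if lang != target
    ]
    return [lang for _, lang in sorted(scored)]


# Toy binary feature vectors (made up, NOT real typological data).
TOY_FEATURES = {
    "target": [1, 0, 1, 1, 0, 1],
    "source_a": [1, 0, 1, 0, 0, 1],
    "source_b": [0, 1, 0, 1, 1, 0],
    "source_c": [1, 1, 1, 1, 0, 0],
}

ranking = rank_transfer_languages(
    "target", ["source_a", "source_b", "source_c"], TOY_FEATURES
)
print(ranking)  # ['source_a', 'source_c', 'source_b']
```

The first entry of the ranking is the candidate that would be picked as the transfer source under this stand-in metric. In a real setting, the toy vectors would be replaced by whichever linguistic similarity measure is being evaluated.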

