CL · LG · May 19, 2020

On the Choice of Auxiliary Languages for Improved Sequence Tagging

arXiv:2005.09389v1998 citations
AI Analysis

This work helps NLP practitioners improve sequence tagging performance by offering insights into auxiliary language selection and a novel method for combining embeddings across languages, though the contribution is incremental.

The paper tackles the problem of selecting auxiliary languages for sequence tagging. It shows that the most related language is not always the best choice and that attention-based meta-embeddings can combine embeddings from different languages, setting new state-of-the-art results for part-of-speech tagging in five languages.

Recent work showed that embeddings from related languages can improve the performance of sequence tagging, even for monolingual models. In this analysis paper, we investigate whether the best auxiliary language can be predicted based on language distances and show that the most related language is not always the best auxiliary language. Further, we show that attention-based meta-embeddings can effectively combine pre-trained embeddings from different languages for sequence tagging and set new state-of-the-art results for part-of-speech tagging in five languages.
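
As a rough illustration of the idea of attention-based meta-embeddings, the sketch below shows one common way to combine several pre-trained embeddings of the same token: project each one to a shared dimension, score the projections, and take a softmax-weighted sum. The abstract does not describe the architecture, so the shared projection, the linear scorer, the class name AttentionMetaEmbedding, and the random, untrained weights are all assumptions made for illustration, not the authors' implementation.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vector):
    """Multiply a (rows x cols) matrix by a vector of length cols."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def random_matrix(rows, cols, rng):
    """Small random weights; a real model would learn these."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class AttentionMetaEmbedding:
    """Hypothetical helper: combines one embedding per source language."""

    def __init__(self, input_dims, shared_dim, seed=0):
        rng = random.Random(seed)
        # One projection per embedding source, mapping to the shared size.
        self.projections = [random_matrix(shared_dim, d, rng) for d in input_dims]
        # Assumed linear scorer that rates each projected embedding.
        self.scorer = [rng.uniform(-0.1, 0.1) for _ in range(shared_dim)]

    def combine(self, embeddings):
        projected = [matvec(p, e) for p, e in zip(self.projections, embeddings)]
        scores = [sum(a * x for a, x in zip(self.scorer, p)) for p in projected]
        weights = softmax(scores)  # attention weights over the sources
        combined = [
            sum(w * p[i] for w, p in zip(weights, projected))
            for i in range(len(self.scorer))
        ]
        return combined, weights


# Toy vectors standing in for a target-language and an auxiliary-language
# embedding of the same token (made-up values, different dimensions).
meta = AttentionMetaEmbedding(input_dims=[4, 6], shared_dim=3)
target_vec = [0.2, -0.1, 0.4, 0.0]
aux_vec = [0.1, 0.3, -0.2, 0.5, 0.0, -0.4]
combined, weights = meta.combine([target_vec, aux_vec])
```

In a trained tagger the combined vector would feed the sequence model, and the attention weights would indicate how much each language's embedding contributes for a given token.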
