CL, LG | Aug 31, 2019

Adversarial Learning with Contextual Embeddings for Zero-resource Cross-lingual Classification and NER

arXiv:1909.00153v31031 citations
AI Analysis

This work addresses cross-lingual classification and NER for languages without labeled data. It is incremental in that it builds on existing multilingual BERT methods.

The paper improves zero-resource cross-lingual performance by applying adversarial learning to multilingual BERT, and reports the magnitude of the resulting gains on MLDoc text classification and CoNLL 2002/2003 NER.

Contextual word embeddings (e.g. GPT, BERT, ELMo) have demonstrated state-of-the-art performance on various NLP tasks. Recent work with the multilingual version of BERT has shown that the model performs very well in zero-shot and zero-resource cross-lingual settings, where only labeled English data is used to fine-tune the model. We improve upon multilingual BERT's zero-resource cross-lingual performance via adversarial learning. We report the magnitude of the improvement on the multilingual MLDoc text classification and CoNLL 2002/2003 named entity recognition tasks. Furthermore, we show that language-adversarial training encourages BERT to align the embeddings of English documents and their translations, which may be the cause of the observed performance gains.
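The abstract does not spell out the adversarial objective, so the following is only a minimal sketch of one common language-adversarial setup, not the paper's implementation. It assumes a binary discriminator that outputs the probability that an embedding came from an English document, and an encoder that is trained against the same discriminator with flipped labels. The function names and the toy probabilities are hypothetical.

```python
import math


def binary_cross_entropy(p, label):
    """Cross-entropy between a predicted P(English) = p and a 0/1 label."""
    eps = 1e-12  # clamp to avoid log(0)
    p = min(max(p, eps), 1.0 - eps)
    return -(label * math.log(p) + (1 - label) * math.log(1.0 - p))


def discriminator_loss(probs, labels):
    """Mean loss for the discriminator: predict the true language label."""
    return sum(binary_cross_entropy(p, y) for p, y in zip(probs, labels)) / len(probs)


def encoder_adversarial_loss(probs, labels):
    """Mean loss for the encoder (assumed formulation): the same discriminator
    outputs scored against flipped labels, so the encoder is rewarded when
    the discriminator misjudges the language."""
    flipped = [1 - y for y in labels]
    return discriminator_loss(probs, flipped)


# Toy batch: 1 = English document, 0 = non-English document (hypothetical values).
labels = [1, 1, 0, 0]
confident = [0.9, 0.8, 0.2, 0.1]  # discriminator separates the languages well
confused = [0.5, 0.5, 0.5, 0.5]   # discriminator cannot tell the languages apart

print("confident:", discriminator_loss(confident, labels), encoder_adversarial_loss(confident, labels))
print("confused: ", discriminator_loss(confused, labels), encoder_adversarial_loss(confused, labels))
```

Under these assumptions, the two losses coincide when the discriminator outputs 0.5 for every document, which is the state in which the embeddings carry no usable language signal. That is one way to read the abstract's claim that adversarial training pushes English documents and their translations toward aligned embeddings.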
