CL LG · Aug 31, 2019

Adversarial Learning with Contextual Embeddings for Zero-resource Cross-lingual Classification and NER

Phillip Keung, Yichao Lu, Vikas Bhardwaj

arXiv:1909.00153v3 30.51031 citations

Originality: Incremental advance

AI Analysis

This work addresses cross-lingual classification and NER for languages that have no labeled data. It is an incremental advance because it builds on existing multilingual BERT methods.

The paper improves zero-resource cross-lingual performance by applying adversarial learning to multilingual BERT, and it reports gains on the MLDoc text classification and CoNLL 2002/2003 NER tasks.

Contextual word embeddings (e.g., GPT, BERT, ELMo) have demonstrated state-of-the-art performance on various NLP tasks. Recent work with the multilingual version of BERT has shown that the model performs very well in zero-shot and zero-resource cross-lingual settings, where only labeled English data is used to fine-tune the model. We improve upon multilingual BERT's zero-resource cross-lingual performance via adversarial learning. We report the magnitude of the improvement on the multilingual MLDoc text classification and CoNLL 2002/2003 named entity recognition tasks. Furthermore, we show that language-adversarial training encourages BERT to align the embeddings of English documents and their translations, which may be the cause of the observed performance gains.
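The language-adversarial idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes a binary discriminator that outputs the probability that an embedding comes from an English sentence, and an encoder trained against the same predictions with the labels flipped. All function names, the `adv_weight` parameter, and the example numbers are hypothetical.

```python
import math


def bce(prob, label):
    """Binary cross-entropy for one predicted probability and a 0/1 label."""
    eps = 1e-12  # clamp to avoid log(0)
    prob = min(max(prob, eps), 1.0 - eps)
    return -(label * math.log(prob) + (1 - label) * math.log(1.0 - prob))


def discriminator_loss(probs_english, is_english):
    """Mean loss for a discriminator predicting P(embedding is English)."""
    losses = [bce(p, y) for p, y in zip(probs_english, is_english)]
    return sum(losses) / len(losses)


def encoder_adversarial_loss(probs_english, is_english):
    """Mean loss for the encoder: same predictions, labels flipped.

    Minimizing this pushes the encoder toward embeddings whose language
    the discriminator gets wrong (an assumed GAN-style formulation).
    """
    flipped = [1 - y for y in is_english]
    return discriminator_loss(probs_english, flipped)


def total_encoder_loss(task_loss, adv_loss, adv_weight=1.0):
    """Combine the supervised task loss (English labels only) with the
    adversarial term. adv_weight is a hypothetical hyperparameter."""
    return task_loss + adv_weight * adv_loss


# Hypothetical discriminator outputs for two English and two non-English inputs.
probs = [0.9, 0.8, 0.2, 0.1]
labels = [1, 1, 0, 0]

d_loss = discriminator_loss(probs, labels)
g_loss = encoder_adversarial_loss(probs, labels)
print(f"discriminator loss: {d_loss:.3f}")
print(f"encoder adversarial loss: {g_loss:.3f}")
```

When the discriminator separates the languages well, as in the example numbers, its own loss is low and the encoder's adversarial loss is high. If the discriminator outputs 0.5 for every input, both losses equal log 2, which corresponds to embeddings from which the language cannot be predicted.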
