A Survey on Contextual Embeddings
It provides a comprehensive overview for researchers and practitioners in natural language processing, but is incremental as it summarizes existing work.
This survey reviews existing contextual embedding models, covering cross-lingual pre-training, applications in downstream tasks, model compression, and analyses, highlighting their ground-breaking performance on NLP tasks.
Contextual embeddings, such as ELMo and BERT, move beyond global word representations like Word2Vec and achieve ground-breaking performance on a wide range of natural language processing tasks. Contextual embeddings assign each word a representation based on its context, thereby capturing uses of words across varied contexts and encoding knowledge that transfers across languages. In this survey, we review existing contextual embedding models, cross-lingual polyglot pre-training, the application of contextual embeddings in downstream tasks, model compression, and model analyses.