CLJun 9, 2017

Learning to Embed Words in Context for Syntactic Tasks

arXiv:1706.02807v239.21091 citations

Originality Incremental advance

AI Analysis

This work addresses the challenge of enhancing syntactic analysis tools for natural language processing, though it appears incremental as it builds on standard neural architectures.

The paper tackled the problem of embedding words in context for syntactic tasks by developing token embedding models from unannotated text, which improved part-of-speech tagging and dependency parsing performance over baselines across various conditions.

We present models for embedding words in the context of surrounding words. Such models, which we refer to as token embeddings, represent the characteristics of a word that are specific to a given context, such as word sense, syntactic category, and semantic role. We explore simple, efficient token embedding models based on standard neural network architectures. We learn token embeddings on a large amount of unannotated text and evaluate them as features for part-of-speech taggers and dependency parsers trained on much smaller amounts of annotated data. We find that predictors endowed with token embeddings consistently outperform baseline predictors across a range of context window and training set sizes.

View on arXiv PDF

Similar