CL LG | Oct 8, 2015

Mapping Unseen Words to Task-Trained Embedding Spaces

Pranava Swaroop Madhyastha, Mohit Bansal, Kevin Gimpel, Karen Livescu

arXiv:1510.02387v2 | 14.925 citations

Originality: Incremental advance

AI Analysis

The paper addresses errors caused by out-of-vocabulary words in NLP tasks. The advance is incremental because it builds on existing embedding methods.

The paper tackles unseen words in supervised tasks by training a neural network that maps initial embeddings into the task-specific embedding space, which improves dependency parsing and sentiment analysis.

We consider the supervised training setting in which we learn task-specific word embeddings. We assume that we start with initial embeddings learned from unlabelled data and update them to learn task-specific embeddings for words in the supervised training data. However, for new words in the test set, we must use either their initial embeddings or a single unknown embedding, which often leads to errors. We address this by learning a neural network to map from initial embeddings to the task-specific embedding space, via a multi-loss objective function. The technique is general, but here we demonstrate its use for improved dependency parsing (especially for sentences with out-of-vocabulary words), as well as for downstream improvements on sentiment analysis.
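The mapping idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the network architecture or the terms of the multi-loss objective, so the sketch assumes a single linear map and a two-term squared loss (one term pulling the output toward the task-trained embedding, one keeping it near the initial embedding). The names `train_mapper`, `alpha`, and the toy vectors are hypothetical.

```python
import random


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def train_mapper(initial, task, alpha=0.8, lr=0.05, epochs=200, seed=0):
    """Fit a linear map W on words that have both kinds of embedding.

    Assumed per-word loss (illustrative, not taken from the paper):
        alpha * ||W x - t||^2 + (1 - alpha) * ||W x - x||^2
    where x is the initial embedding and t the task-trained embedding.
    """
    words = sorted(set(initial) & set(task))
    dim = len(initial[words[0]])
    # Start from the identity map, i.e. "use the initial embedding as is".
    W = [[1.0 if i == j else 0.0 for j in range(dim)] for i in range(dim)]
    rng = random.Random(seed)
    for _ in range(epochs):
        rng.shuffle(words)
        for w in words:
            x, t = initial[w], task[w]
            y = matvec(W, x)
            for i in range(dim):
                # Gradient of the two-term loss with respect to output i.
                g = 2 * alpha * (y[i] - t[i]) + 2 * (1 - alpha) * (y[i] - x[i])
                for j in range(dim):
                    W[i][j] -= lr * g * x[j]
    return W


# Toy data: words seen in supervised training have both embeddings.
initial = {"good": [1.0, 0.0], "bad": [0.0, 1.0], "fine": [1.0, 1.0]}
task = {w: [2.0 * v for v in vec] for w, vec in initial.items()}

W = train_mapper(initial, task)

# An unseen test word has only an initial embedding; map it into the
# task-specific space instead of falling back to a single unknown vector.
unseen_initial = [0.5, -0.5]
mapped = matvec(W, unseen_initial)
```

In this toy setup `alpha` trades off fidelity to the task-trained embeddings against staying close to the initial ones. At test time, any word missing from the supervised training data is passed through the learned map before being fed to the downstream model.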
