CL, Jul 25, 2014

Substitute Based SCODE Word Embeddings in Supervised NLP Tasks

arXiv:1407.6853v1, 1 citation
Originality: Incremental advance
AI Analysis

This work addresses the need for better word embeddings in NLP tasks like parsing, but it appears incremental as it builds on existing embedding methods with a novel similarity measure.

The paper tackles the problem of improving word embeddings for supervised NLP tasks by mapping words on a sphere based on substitute distributions, and it achieves state-of-the-art results in multilingual dependency parsing while performing competitively in NER and chunking.

We analyze a word embedding method in supervised tasks. It maps words onto a sphere such that words co-occurring in similar contexts lie close together. The similarity of contexts is measured by the distribution of substitutes that can fill them. We compare word embeddings, including more recent representations, in Named Entity Recognition (NER), Chunking, and Dependency Parsing. We also examine our framework in multilingual dependency parsing. The results show that the proposed method performs as well as or better than the other word embeddings in the tasks we investigate, and it achieves state-of-the-art results in multilingual dependency parsing. Word embeddings in 7 languages are available for public use.
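
The idea of comparing contexts through the words that could fill them can be illustrated with a small sketch. This is not the SCODE algorithm itself: the contexts, substitute weights, and the use of cosine similarity below are all assumptions made up for illustration, and the helper names (`normalize`, `cosine`, `to_sphere`) are hypothetical. It only shows that two contexts with overlapping substitute distributions score as more similar than two with none in common, and what placing a vector on the unit sphere means.

```python
import math


def normalize(weights):
    """Scale non-negative weights so they sum to 1 (a probability distribution)."""
    total = sum(weights.values())
    return {word: value / total for word, value in weights.items()}


def cosine(p, q):
    """Cosine similarity between two sparse vectors stored as dicts.

    Assumption: cosine is used here only as a simple stand-in similarity;
    the source does not say which measure the paper uses.
    """
    dot = sum(value * q.get(word, 0.0) for word, value in p.items())
    norm_p = math.sqrt(sum(v * v for v in p.values()))
    norm_q = math.sqrt(sum(v * v for v in q.values()))
    return dot / (norm_p * norm_q)


def to_sphere(vec):
    """Project a dense vector onto the unit sphere by dividing by its L2 norm."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


# Hypothetical substitute weights for three contexts (made-up numbers).
ctx_a = normalize({"cat": 5, "dog": 4, "bird": 1})   # "the ___ sat on the mat"
ctx_b = normalize({"dog": 5, "cat": 3, "horse": 2})  # "the ___ ran across the yard"
ctx_c = normalize({"quickly": 6, "slowly": 4})       # "she walked ___ home"

sim_ab = cosine(ctx_a, ctx_b)  # shared substitutes, so similarity is high
sim_ac = cosine(ctx_a, ctx_c)  # no shared substitutes, so similarity is zero

point = to_sphere([3.0, 4.0])  # a vector rescaled to unit length
```

In this toy setup, `sim_ab` is larger than `sim_ac`, which mirrors the intuition in the abstract: words that appear in contexts with similar substitute distributions should end up near each other on the sphere.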

Code Implementations: 1 repo
