CL | May 23, 2018

Embedding Syntax and Semantics of Prepositions via Tensor Decomposition

arXiv:1805.09389v1 | 1093 citations
Originality: Incremental advance
AI Analysis

This paper addresses difficulties that prepositions cause in automatic sentence processing for NLP applications. The advance is incremental: it builds on existing embedding methods and adds a novel geometric approach.

The paper tackles the problem of representing prepositions in natural language processing. It captures their syntactic and semantic interactions using word-triple counts and tensor decomposition, and reports results comparable to or better than the state of the art on tasks such as preposition selection and prepositional attachment disambiguation.

Prepositions are among the most frequent words in English and play complex roles in the syntax and semantics of sentences. Not surprisingly, they pose well-known difficulties in automatic processing of sentences (prepositional attachment ambiguities and idiosyncratic uses in phrases). Existing methods for preposition representation treat prepositions no differently from content words (e.g., word2vec and GloVe). In addition, recent studies aiming at solving prepositional attachment and preposition selection problems depend heavily on external linguistic resources and use dataset-specific word representations. In this paper we use word-triple counts (one word in each triple being a preposition) to capture a preposition's interaction with its attachment and complement. We then derive preposition embeddings via tensor decomposition on a large unlabeled corpus. We reveal a new geometry involving Hadamard products and empirically demonstrate its utility in paraphrasing phrasal verbs. Furthermore, our preposition embeddings are used as simple features in two challenging downstream tasks: preposition selection and prepositional attachment disambiguation. We achieve results comparable to or better than the state of the art on multiple standardized datasets.
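
The pipeline described in the abstract (count word triples, then factor the resulting tensor) can be illustrated with a small sketch. This is not the authors' implementation: the toy triples, the choice of a plain CP decomposition fitted by alternating least squares, the rank, and the helper names `index` and `cp_als` are all assumptions made for illustration. The abstract does not say which decomposition or hyperparameters the paper uses.

```python
import numpy as np

# Toy (attachment, preposition, complement) triples; purely illustrative data.
triples = [
    ("sat", "on", "chair"),
    ("sat", "on", "bench"),
    ("sat", "in", "room"),
    ("lived", "in", "room"),
    ("lived", "in", "city"),
    ("book", "on", "table"),
]

def index(words):
    """Map each distinct word to an integer index (hypothetical helper)."""
    return {w: i for i, w in enumerate(sorted(set(words)))}

heads = index(t[0] for t in triples)
preps = index(t[1] for t in triples)
comps = index(t[2] for t in triples)

# Count tensor: X[h, p, c] = number of occurrences of the triple (h, p, c).
X = np.zeros((len(heads), len(preps), len(comps)))
for h, p, c in triples:
    X[heads[h], preps[p], comps[c]] += 1.0

def cp_als(X, rank, n_iter=50, seed=0):
    """Rank-`rank` CP decomposition of a 3-way tensor by alternating least squares.

    Assumed stand-in for the paper's decomposition, not the authors' method.
    """
    rng = np.random.default_rng(seed)
    A, B, C = (rng.standard_normal((n, rank)) for n in X.shape)
    for _ in range(n_iter):
        # Each update is a least-squares solve with the other two factors fixed.
        A = np.einsum("ijk,jr,kr->ir", X, B, C) @ np.linalg.pinv((B.T @ B) * (C.T @ C))
        B = np.einsum("ijk,ir,kr->jr", X, A, C) @ np.linalg.pinv((A.T @ A) * (C.T @ C))
        C = np.einsum("ijk,ir,jr->kr", X, A, B) @ np.linalg.pinv((A.T @ A) * (B.T @ B))
    return A, B, C

A, B, C = cp_als(X, rank=2)

# Rows of B serve as the preposition embeddings in this sketch.
prep_embeddings = {p: B[i] for p, i in preps.items()}

# Reconstruct the tensor from the factors and measure the relative error.
X_hat = np.einsum("ir,jr,kr->ijk", A, B, C)
rel_err = np.linalg.norm(X - X_hat) / np.linalg.norm(X)

# In a CP model, each reconstructed entry is the dot product of the complement
# vector with the Hadamard (element-wise) product of the other two vectors.
h, p, c = heads["sat"], preps["on"], comps["chair"]
score = np.dot(A[h] * B[p], C[c])
```

The last lines show where an element-wise product arises in a CP model: the reconstructed count for a triple equals the complement vector dotted with the Hadamard product of the attachment and preposition vectors. The specific geometry the paper reports for paraphrasing phrasal verbs is not reproduced here.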

