On the Compositionality Prediction of Noun Phrases using Poincaré Embeddings
This paper addresses the problem of accurately interpreting multiword expressions in natural language processing applications. It is an incremental advance that enhances existing distributional methods with hierarchical information.
The paper predicts the compositionality of noun phrases by blending hierarchical hypernymy information, encoded as Poincaré embeddings, with distributional information. It reports consistent, substantial, and statistically significant improvements over state-of-the-art distributional models on three gold standard datasets.
The compositionality degree of multiword expressions indicates to what extent the meaning of a phrase can be derived from the meaning of its constituents and their grammatical relations. Prediction of (non)-compositionality is a task that has been frequently addressed with distributional semantic models. We introduce a novel technique to blend hierarchical information with distributional information for predicting compositionality. In particular, we use hypernymy information of the multiword and its constituents encoded in the form of the recently introduced Poincaré embeddings in addition to the distributional information to detect compositionality for noun phrases. Using a weighted average of the distributional similarity and a Poincaré similarity function, we obtain consistent and substantial, statistically significant improvement across three gold standard datasets over state-of-the-art models based on distributional information only. Unlike traditional approaches that solely use an unsupervised setting, we have also framed the problem as a supervised task, obtaining comparable improvements. Further, we publicly release our Poincaré embeddings, which are trained on the output of handcrafted lexical-syntactic patterns on a large corpus.
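The weighted average described above can be illustrated with a minimal sketch. The distance function is the standard Poincaré ball distance; everything else is an assumption made for illustration and is not taken from the paper: the mapping from distance to similarity (`1 / (1 + d)`), the function names, the weight name `alpha`, and its default value.

```python
import math


def poincare_distance(u, v):
    """Distance between two points inside the unit ball (Poincaré ball model)."""
    sq_norm_u = sum(x * x for x in u)
    sq_norm_v = sum(x * x for x in v)
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    denom = (1.0 - sq_norm_u) * (1.0 - sq_norm_v)
    return math.acosh(1.0 + 2.0 * sq_diff / denom)


def poincare_similarity(u, v):
    # Assumption: turn the distance into a similarity in (0, 1].
    # The paper's actual similarity function may differ.
    return 1.0 / (1.0 + poincare_distance(u, v))


def cosine_similarity(a, b):
    """Cosine similarity, standing in for the distributional similarity."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)


def blended_score(distributional_sim, hierarchical_sim, alpha=0.5):
    # Weighted average of the two similarities; alpha is a hypothetical
    # weight in [0, 1] that would be tuned on held-out data.
    return (1.0 - alpha) * distributional_sim + alpha * hierarchical_sim


if __name__ == "__main__":
    # Toy vectors, invented purely for illustration.
    phrase_dist, head_dist = [0.2, 0.9, 0.1], [0.3, 0.8, 0.2]
    phrase_hyp, head_hyp = [0.10, 0.20], [0.15, 0.25]
    score = blended_score(
        cosine_similarity(phrase_dist, head_dist),
        poincare_similarity(phrase_hyp, head_hyp),
        alpha=0.3,
    )
    print(f"blended compositionality score: {score:.3f}")
```

With `alpha = 0` the score reduces to the purely distributional baseline, so the weight directly controls how much the hierarchical signal contributes.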