A Multivariate Model for Representing Semantic Non-compositionality
This work addresses the challenge of accurately detecting non-compositional phrases for NLP applications, offering an incremental improvement over prior methods.
The paper tackles the problem of identifying semantically non-compositional phrases in NLP by developing a model that combines multiple characteristics, such as statistical association and non-substitutability. The resulting model remarkably outperforms existing models that focus on a single characteristic.
Semantically non-compositional phrases constitute an intriguing research topic in Natural Language Processing. Semantic non-compositionality, the situation in which the meaning of a phrase cannot be derived from the meanings of its components, is the main characteristic of such phrases. However, they also exhibit other characteristics, such as high statistical association and non-substitutability. In this work, we present a model for identifying non-compositional phrases that takes all of these characteristics into account. We show that the presented model remarkably outperforms existing models for identifying non-compositional phrases, most of which focus on only one of these characteristics.
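To make the notion of statistical association concrete, the following is a minimal sketch that scores adjacent word pairs with pointwise mutual information (PMI), one commonly used association measure. It is an illustration only, not the model presented in this work: the function name `pmi_scores`, the toy sentence, and the choice of PMI itself are assumptions introduced here.

```python
import math
from collections import Counter


def pmi_scores(tokens):
    """Return the PMI (log base 2) of every adjacent word pair in `tokens`.

    Hypothetical helper for illustration; not the paper's model.
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_uni = sum(unigrams.values())
    n_bi = sum(bigrams.values())

    scores = {}
    for (w1, w2), count in bigrams.items():
        p_pair = count / n_bi          # relative frequency of the pair
        p_w1 = unigrams[w1] / n_uni    # relative frequency of the first word
        p_w2 = unigrams[w2] / n_uni    # relative frequency of the second word
        scores[(w1, w2)] = math.log2(p_pair / (p_w1 * p_w2))
    return scores


# Toy corpus, invented for this example.
tokens = "he kicked the bucket and she kicked the ball over the bucket".split()
scores = pmi_scores(tokens)

# Print pairs from most to least strongly associated.
for pair, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(pair, round(score, 3))
```

A high PMI means that two words co-occur more often than their individual frequencies would predict. On its own, such a score captures only one of the characteristics discussed above, which is the limitation of single-characteristic approaches that the abstract points to.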