Distributional Inclusion Hypothesis for Tensor-based Composition
This work extends distributional semantics to compositional structures for natural language processing; its contribution is incremental, since it builds on existing hypotheses and methods.
The paper tackles the problem of measuring entailment between phrases and sentences using tensor-based composition methods, and finds that certain tensor models, combined with sentence-level metrics, achieve the highest performance on entailment datasets.
According to the distributional inclusion hypothesis, entailment between words can be measured via the feature inclusions of their distributional vectors. In recent work, we showed how this hypothesis can be extended from words to phrases and sentences in the setting of compositional distributional semantics. This paper focuses on inclusion properties of tensors; its main contribution is a theoretical and experimental analysis of how feature inclusion works in different concrete models of verb tensors. We present results for relational, Frobenius, projective, and holistic methods and compare them to the simple vector addition, multiplication, min, and max models. The degrees of entailment thus obtained are evaluated via a variety of existing word-based measures, such as Weeds' and Clarke's measures, KL-divergence, APinc, and balAPinc, and via two of our previously proposed metrics at the phrase/sentence level. We perform experiments on three entailment datasets, investigating which version of tensor-based composition achieves the highest performance when combined with the sentence-level measures.
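
To make the notion of feature inclusion concrete, the following is a minimal sketch, not the implementation used in the paper. It shows the standard word-level definitions of Weeds' precision and Clarke's degree of entailment on non-negative vectors, together with the four simple element-wise composition baselines (addition, multiplication, min, max). The function names, the `COMPOSERS` table, and the toy vectors for "dog" and "animal" are hypothetical and chosen only for illustration; the tensor-based verb models and the phrase/sentence-level metrics are not reproduced here.

```python
from typing import Callable, Dict, List

Vector = List[float]  # non-negative co-occurrence weights, one per feature


def weeds_precision(u: Vector, v: Vector) -> float:
    """Share of u's feature mass that lies on features also active in v."""
    total = sum(u)
    if total == 0:
        return 0.0
    shared = sum(ui for ui, vi in zip(u, v) if ui > 0 and vi > 0)
    return shared / total


def clarke_de(u: Vector, v: Vector) -> float:
    """Clarke's degree of entailment: sum of element-wise minima over u's mass."""
    total = sum(u)
    if total == 0:
        return 0.0
    return sum(min(ui, vi) for ui, vi in zip(u, v)) / total


# Simple element-wise composition baselines (hypothetical lookup table).
COMPOSERS: Dict[str, Callable[[float, float], float]] = {
    "add": lambda a, b: a + b,
    "mult": lambda a, b: a * b,
    "min": min,
    "max": max,
}


def compose(u: Vector, v: Vector, op: Callable[[float, float], float]) -> Vector:
    """Combine two word vectors feature by feature with the given operation."""
    return [op(a, b) for a, b in zip(u, v)]


# Toy vectors, invented for illustration: every feature of "dog" is also
# a feature of "animal", but not the other way around.
dog: Vector = [2.0, 1.0, 0.0, 0.0]
animal: Vector = [1.0, 1.0, 3.0, 0.0]

dog_to_animal = weeds_precision(dog, animal)  # all of dog's features are included
animal_to_dog = weeds_precision(animal, dog)  # only part of animal's mass is included
```

On these toy vectors the measures are asymmetric: the score from "dog" to "animal" is higher than the score in the reverse direction, which is the behaviour the distributional inclusion hypothesis predicts for an entailing pair. The same measures can be applied to composed vectors, for example `clarke_de(compose(dog, animal, COMPOSERS["min"]), animal)`.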