Distributional Sentence Entailment Using Density Matrices
This work addresses the challenge of formalizing entailment in natural language processing and is aimed at researchers in computational linguistics. It is incremental in that it builds on existing compositional distributional models.
The paper models sentence-level entailment by extending categorical compositional distributional models to represent words as density operators. The resulting method combines lexical entailment, measured with a von Neumann (quantum relative) entropy, with grammatical composition to derive entailment relations between sentences.
The categorical compositional distributional model of Coecke et al. (2010) suggests a way to combine the grammatical composition of formal, type-logical models with the corpus-based, empirical word representations of distributional semantics. This paper contributes to that project by expanding the model to also capture entailment relations. This is achieved by extending the representations of words from points in meaning space to density operators, which are probability distributions on the subspaces of the space. A symmetric measure of similarity and an asymmetric measure of entailment are defined, where lexical entailment is measured using quantum relative entropy, the von Neumann entropy-based quantum variant of Kullback-Leibler divergence. Lexical entailment, combined with the composition map on word representations, provides a method to obtain entailment relations at the level of sentences. Truth-theoretic and corpus-based examples are provided.
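The asymmetric lexical measure described above can be illustrated with a small numerical sketch. It assumes that a word's density operator is a weighted mixture of projectors onto normalized context vectors, and that the divergence is the standard quantum relative entropy N(rho || sigma) = Tr(rho log rho) - Tr(rho log sigma). The function names, the toy two-dimensional "dog"/"animal" space, and the 1 / (1 + N) score are illustrative assumptions, not definitions taken from the text above.

```python
import numpy as np

def density_from_vectors(vectors, weights):
    """Build rho = sum_i p_i |v_i><v_i| from vectors (normalized here) and weights p_i."""
    dim = len(vectors[0])
    rho = np.zeros((dim, dim))
    for v, p in zip(vectors, weights):
        v = np.asarray(v, dtype=float)
        v = v / np.linalg.norm(v)
        rho += p * np.outer(v, v)
    return rho / np.trace(rho)  # enforce unit trace

def relative_entropy(rho, sigma, tol=1e-9):
    """Quantum relative entropy N(rho || sigma), natural logarithm.

    Returns inf when the support of rho is not contained in the support of sigma.
    """
    p, U = np.linalg.eigh(rho)    # eigenvalues p_i, eigenvectors as columns of U
    q, V = np.linalg.eigh(sigma)  # eigenvalues q_j, eigenvectors as columns of V
    overlap = np.abs(U.conj().T @ V) ** 2  # overlap[i, j] = |<u_i|v_j>|^2
    total = 0.0
    for i, p_i in enumerate(p):
        if p_i <= tol:
            continue  # 0 * log 0 is taken to be 0
        total += p_i * np.log(p_i)
        for j, q_j in enumerate(q):
            if overlap[i, j] <= tol:
                continue
            if q_j <= tol:
                return float("inf")  # rho has weight where sigma has none
            total -= p_i * overlap[i, j] * np.log(q_j)
    return float(total)

def representativeness(rho, sigma):
    """Map the divergence to a score in [0, 1]; this 1 / (1 + N) form is an assumption."""
    return 1.0 / (1.0 + relative_entropy(rho, sigma))

# Toy example (hypothetical data): "dog" is a pure state, "animal" mixes dog and cat.
dog = density_from_vectors([[1, 0]], [1.0])
animal = density_from_vectors([[1, 0], [0, 1]], [0.5, 0.5])

print(relative_entropy(dog, animal), representativeness(dog, animal))
print(relative_entropy(animal, dog), representativeness(animal, dog))
```

In this toy space the divergence from "dog" to "animal" is finite (it equals log 2), while the divergence from "animal" to "dog" is infinite, so the score is positive in one direction and zero in the other. That asymmetry is what lets the measure act as a graded entailment relation rather than a similarity.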