Syntax-based Deep Matching of Short Texts
This addresses a hard matching problem in NLP for applications like social media, but it appears incremental as it builds on existing dependency tree and neural network methods.
The paper tackles the problem of matching short texts, such as tweets and responses, by proposing DeepMatch_tree, which combines pattern mining in dependency trees with a deep neural network, achieving large-margin improvements over competitor models.
Many tasks in natural language processing, ranging from machine translation to question answering, can be reduced to the problem of matching two sentences or more generally two short texts. We propose a new approach to the problem, called Deep Match Tree (DeepMatch$_{tree}$), under a general setting. The approach consists of two components, 1) a mining algorithm to discover patterns for matching two short-texts, defined in the product space of dependency trees, and 2) a deep neural network for matching short texts using the mined patterns, as well as a learning algorithm to build the network having a sparse structure. We test our algorithm on the problem of matching a tweet and a response in social media, a hard matching problem proposed in [Wang et al., 2013], and show that DeepMatch$_{tree}$ can outperform a number of competitor models including one without using dependency trees and one based on word-embedding, all with large margins