Discriminative Phrase Embedding for Paraphrase Identification
The paper addresses the problem of improving sentence representations for paraphrase identification. The contribution is incremental: it builds on existing embedding methods and adds task-specific enhancements.
The paper tackles paraphrase identification by extending deep learning embeddings to cover continuous and discontinuous phrases and by introducing TF-KLD-KNN, a scheme that learns discriminative weights for words and phrases. It reports performance competitive with the state of the art.
This work on the paraphrase identification task makes two contributions. First, it extends deep learning embeddings to cover continuous and discontinuous linguistic phrases. Second, it proposes a new scheme, TF-KLD-KNN, that learns discriminative weights for words and phrases specific to the paraphrase task, so that a weighted sum of embeddings represents sentences more effectively. Building on these two innovations, we obtain performance competitive with the state of the art on paraphrase identification.
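
The weighted-sum representation described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the function names (`cosine`, `knn_weight`, `sentence_vector`), the two-dimensional toy embeddings, and the weight values, which are hand-picked numbers rather than real TF-KLD output. The fallback that gives an unweighted unit the average weight of its nearest neighbours in embedding space is one plausible reading of the "KNN" suffix and is not confirmed by the text here. A discontinuous phrase is simply treated as one more unit with its own embedding.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def knn_weight(unit, embeddings, weights, k=2):
    """Assumed fallback: average the weights of the k weighted units
    whose embeddings are closest to `unit` by cosine similarity."""
    neighbours = sorted(
        (u for u in weights if u in embeddings and u != unit),
        key=lambda u: cosine(embeddings[unit], embeddings[u]),
        reverse=True,
    )[:k]
    if not neighbours:
        return 0.0
    return sum(weights[u] for u in neighbours) / len(neighbours)


def sentence_vector(units, embeddings, weights, k=2):
    """Represent a sentence as the weighted sum of its unit embeddings.
    Units without an embedding are skipped."""
    dim = len(next(iter(embeddings.values())))
    total = [0.0] * dim
    for unit in units:
        if unit not in embeddings:
            continue
        if unit in weights:
            w = weights[unit]
        else:
            w = knn_weight(unit, embeddings, weights, k)
        for i, x in enumerate(embeddings[unit]):
            total[i] += w * x
    return total


# Toy data (hypothetical values, not taken from the paper).
embeddings = {
    "not": [1.0, 0.0],
    "never": [0.9, 0.1],
    "the": [0.0, 1.0],
    "give_up": [0.5, 0.5],  # a discontinuous phrase handled as a single unit
}
weights = {"not": 0.8, "the": 0.1, "give_up": 0.6}  # "never" has no weight

vec = sentence_vector(["the", "never", "give_up"], embeddings, weights, k=1)
```

In this toy run, "never" has no weight of its own, so with `k=1` it borrows the weight of "not", its closest weighted neighbour. The low weight on "the" shows how a discriminative weighting can suppress units that carry little signal for the paraphrase decision.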