CL AI LG · Mar 7, 2018

An efficient framework for learning sentence representations

arXiv:1803.02893v1 · 21.7576 citations

Originality: Incremental advance

AI Analysis

This work provides an efficient solution for NLP practitioners who need high-quality sentence embeddings, though the advance is incremental because it builds on existing distributional and representation learning ideas.

The authors tackle the problem of learning sentence representations from unlabeled data by reformulating context prediction as a classification task. The resulting representations outperform state-of-the-art unsupervised and supervised methods on several downstream NLP tasks, with an order of magnitude speedup in training time.

In this work we propose a simple and efficient framework for learning sentence representations from unlabelled data. Drawing inspiration from the distributional hypothesis and recent work on learning sentence representations, we reformulate the problem of predicting the context in which a sentence appears as a classification problem. Given a sentence and its context, a classifier distinguishes context sentences from other contrastive sentences based on their vector representations. This allows us to efficiently learn different types of encoding functions, and we show that the model learns high-quality sentence representations. We demonstrate that our sentence representations outperform state-of-the-art unsupervised and supervised representation learning methods on several downstream NLP tasks that involve understanding sentence semantics while achieving an order of magnitude speedup in training time.
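The classification objective described in the abstract can be illustrated with a minimal sketch. It assumes, for illustration only, that candidates are scored by an inner product between fixed toy vectors and that the classifier is a softmax over those scores. The abstract does not specify the scoring function or the encoders, and the function names here are hypothetical.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def context_probabilities(sentence_vec, candidate_vecs):
    """Softmax over similarity scores between a sentence and each candidate.

    Assumption: similarity is a plain inner product. In a real model the
    vectors would come from a learned sentence encoder, not fixed lists.
    """
    scores = [dot(sentence_vec, c) for c in candidate_vecs]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def context_classification_loss(sentence_vec, candidate_vecs, true_index):
    """Negative log-probability assigned to the true context sentence."""
    probs = context_probabilities(sentence_vec, candidate_vecs)
    return -math.log(probs[true_index])


# Toy vectors standing in for encoder outputs (hypothetical values).
sentence = [1.0, 0.0]
candidates = [
    [0.9, 0.1],   # index 0: the true context sentence
    [-1.0, 0.2],  # contrastive sentence
    [0.0, 1.0],   # contrastive sentence
]

probs = context_probabilities(sentence, candidates)
loss = context_classification_loss(sentence, candidates, true_index=0)
```

Minimizing this loss pushes the representation of a sentence toward that of its true context and away from the contrastive sentences. The classifier only compares vectors and never has to generate the words of the context sentence.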
