CL LG

Feb 10, 2016

Learning Distributed Representations of Sentences from Unlabelled Data

Felix Hill, Kyunghyun Cho, Anna Korhonen

arXiv:1602.03483v132.2591 citations

Originality: Incremental advance

AI Analysis

This work addresses a problem of interest to NLP researchers, namely how to learn sentence representations, but the advance is incremental because it builds on existing word representation methods.

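The paper systematically compares models for learning distributed sentence representations from unlabelled data. It finds that deeper, more complex models are preferable when the representations feed supervised systems, while shallow log-linear models work best for representation spaces that are decoded with simple spatial distance metrics. It also proposes two new unsupervised objectives designed to optimise the trade-off between training time, domain portability and performance.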

Unsupervised methods for learning distributed representations of words are ubiquitous in today's NLP research, but far less is known about the best ways to learn distributed phrase or sentence representations from unlabelled data. This paper is a systematic comparison of models that learn such representations. We find that the optimal approach depends critically on the intended application. Deeper, more complex models are preferable for representations to be used in supervised systems, but shallow log-linear models work best for building representation spaces that can be decoded with simple spatial distance metrics. We also propose two new unsupervised representation-learning objectives designed to optimise the trade-off between training time, domain portability and performance.
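To make "decoded with simple spatial distance metrics" concrete, here is a minimal sketch of comparing sentences by cosine similarity. It is not one of the models from the paper. The word vectors are invented toy values, averaging word vectors is an assumed stand-in for a shallow model, and the names `TOY_WORD_VECTORS`, `sentence_vector`, `cosine_similarity` and `nearest` are hypothetical.

```python
import math

# Toy 3-dimensional word vectors, invented purely for illustration.
# A real system would load vectors learned from unlabelled text.
TOY_WORD_VECTORS = {
    "cat": [0.9, 0.1, 0.0],
    "kitten": [0.8, 0.2, 0.1],
    "sleeps": [0.1, 0.9, 0.0],
    "naps": [0.2, 0.8, 0.1],
    "stocks": [0.0, 0.1, 0.9],
    "fell": [0.1, 0.0, 0.8],
}


def sentence_vector(sentence):
    """Average the vectors of the known words (an assumed shallow composition)."""
    words = sentence.lower().split()
    vectors = [TOY_WORD_VECTORS[w] for w in words if w in TOY_WORD_VECTORS]
    if not vectors:
        raise ValueError("no known words in sentence")
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest(query, candidates):
    """Return the candidate sentence whose vector is closest to the query."""
    q = sentence_vector(query)
    return max(candidates, key=lambda s: cosine_similarity(q, sentence_vector(s)))


if __name__ == "__main__":
    print(nearest("cat sleeps", ["kitten naps", "stocks fell"]))
```

No classifier is trained here: the similarity judgement is read directly off the geometry of the space, which is the kind of use the abstract contrasts with supervised systems.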
