CL | LG | Feb 10, 2016

Learning Distributed Representations of Sentences from Unlabelled Data

arXiv:1602.03483v1 | 591 citations
AI Analysis

This work addresses a problem relevant to NLP researchers, learning sentence representations, but it is incremental in that it builds on existing word representation methods.

The paper systematically compares models for learning distributed sentence representations from unlabelled data. It finds that deeper, more complex models suit representations used in supervised systems, while shallow log-linear models work best for representation spaces decoded with simple spatial distance metrics. It also proposes two new unsupervised objectives that trade off training time, domain portability, and performance.

Unsupervised methods for learning distributed representations of words are ubiquitous in today's NLP research, but far less is known about the best ways to learn distributed phrase or sentence representations from unlabelled data. This paper is a systematic comparison of models that learn such representations. We find that the optimal approach depends critically on the intended application. Deeper, more complex models are preferable for representations to be used in supervised systems, but shallow log-linear models work best for building representation spaces that can be decoded with simple spatial distance metrics. We also propose two new unsupervised representation-learning objectives designed to optimise the trade-off between training time, domain portability and performance.
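
To make "decoded with simple spatial distance metrics" concrete, here is a minimal sketch of one such metric, cosine similarity, used to find the nearest sentence in a representation space. The vectors are made-up 3-dimensional toy values, not output from any model in the paper, and the helper names `cosine_similarity` and `nearest` are hypothetical, chosen only for illustration.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def nearest(query, candidates):
    """Return the key in `candidates` whose vector is most similar to `query`."""
    return max(candidates, key=lambda k: cosine_similarity(query, candidates[k]))

# Toy "sentence vectors" (assumed values for illustration only).
sentence_vectors = {
    "a cat sat on the mat": [0.9, 0.1, 0.0],
    "a kitten rested on the rug": [0.8, 0.2, 0.1],
    "stocks fell sharply today": [0.0, 0.1, 0.9],
}

query_sentence = "a cat sat on the mat"
query = sentence_vectors[query_sentence]
others = {s: v for s, v in sentence_vectors.items() if s != query_sentence}

# No classifier is trained here: similarity is read directly off the geometry.
best = nearest(query, others)
print(best)
```

In this setting the representation is used without any supervised layer on top, which is the contrast the abstract draws with representations fed into supervised systems.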

Code Implementations: 1 repo
