CLLGMLAug 16, 2017

Deconvolutional Paragraph Representation Learning

arXiv:1708.04729v379 citations
AI Analysis

This addresses a bottleneck in natural language processing for applications requiring long text sequence representation, though it appears incremental as an alternative to RNNs.

The authors tackled the problem of decreasing sentence quality in RNN-based decoding for long text sequences by proposing a convolutional and deconvolutional autoencoding framework, which they showed empirically to be better at reconstructing and correcting long paragraphs compared to RNNs.

Learning latent representations from long text sequences is an important first step in many natural language processing applications. Recurrent Neural Networks (RNNs) have become a cornerstone for this challenging task. However, the quality of sentences during RNN-based decoding (reconstruction) decreases with the length of the text. We propose a sequence-to-sequence, purely convolutional and deconvolutional autoencoding framework that is free of the above issue, while also being computationally efficient. The proposed method is simple, easy to implement and can be leveraged as a building block for many applications. We show empirically that compared to RNNs, our framework is better at reconstructing and correcting long paragraphs. Quantitative evaluation on semi-supervised text classification and summarization tasks demonstrate the potential for better utilization of long unlabeled text data.

Code Implementations4 repos
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes