CLFeb 6, 2018

Byte-Level Recursive Convolutional Auto-Encoder for Text

arXiv:1802.01817v15 citations
Originality Incremental advance
AI Analysis

This addresses scalable and homogeneous text generation for natural language processing, though it appears incremental as it builds on existing auto-encoder and convolutional methods.

The paper tackled the problem of non-sequential text generation at byte-level using a recursive convolutional auto-encoder, achieving better auto-encoding results than recurrent networks on large-scale datasets in multiple languages.

This article proposes to auto-encode text at byte-level using convolutional networks with a recursive architecture. The motivation is to explore whether it is possible to have scalable and homogeneous text generation at byte-level in a non-sequential fashion through the simple task of auto-encoding. We show that non-sequential text generation from a fixed-length representation is not only possible, but also achieved much better auto-encoding results than recurrent networks. The proposed model is a multi-stage deep convolutional encoder-decoder framework using residual connections, containing up to 160 parameterized layers. Each encoder or decoder contains a shared group of modules that consists of either pooling or upsampling layers, making the network recursive in terms of abstraction levels in representation. Results for 6 large-scale paragraph datasets are reported, in 3 languages including Arabic, Chinese and English. Analyses are conducted to study several properties of the proposed model.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes