CL Sep 18, 2018

Learning Universal Sentence Representations with Mean-Max Attention Autoencoder

arXiv:1809.06590v11101 citations
Originality: Incremental advance
AI Analysis

This work addresses the need for efficient and high-quality unsupervised sentence representations for natural language processing tasks, though it is incremental as it builds on existing encoder-decoder and attention mechanisms.

The paper tackles the problem of learning universal sentence representations by proposing a mean-max attention autoencoder that uses MultiHead self-attention and a mean-max pooling strategy, achieving state-of-the-art performance on 10 transfer tasks and reducing training time compared to traditional recurrent neural networks.

In order to learn universal sentence representations, previous methods focus on complex recurrent neural networks or supervised learning. In this paper, we propose a mean-max attention autoencoder (mean-max AAE) within the encoder-decoder framework. Our autoencoder relies entirely on the MultiHead self-attention mechanism to reconstruct the input sequence. In the encoding, we propose a mean-max strategy that applies both mean and max pooling operations over the hidden vectors to capture diverse information of the input. To enable the information to steer the reconstruction process dynamically, the decoder performs attention over the mean-max representation. By training our model on a large collection of unlabelled data, we obtain high-quality representations of sentences. Experimental results on a broad range of 10 transfer tasks demonstrate that our model outperforms the state-of-the-art unsupervised single methods, including the classical skip-thoughts and the advanced skip-thoughts+LN model. Furthermore, compared with the traditional recurrent neural network, our mean-max AAE greatly reduces the training time.
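The mean-max strategy described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes the encoder has already produced one hidden vector per token, and the names `mean_max_pool` and `attend`, the optional padding mask, and the single-head scaled dot-product attention are illustrative assumptions (the paper uses MultiHead attention). It only shows how mean and max pooling give a two-vector representation that a decoder query can attend over.

```python
import numpy as np

def mean_max_pool(hidden, mask=None):
    """Pool token hidden states into a (2, d) mean-max representation.

    hidden: (seq_len, d) array of encoder hidden vectors (assumed given).
    mask:   optional (seq_len,) boolean array, True for real tokens
            (hypothetical padding convention, not taken from the paper).
    """
    hidden = np.asarray(hidden, dtype=float)
    if mask is None:
        mask = np.ones(hidden.shape[0], dtype=bool)
    valid = hidden[np.asarray(mask, dtype=bool)]
    mean_vec = valid.mean(axis=0)  # average over the token axis
    max_vec = valid.max(axis=0)    # element-wise maximum over tokens
    return np.stack([mean_vec, max_vec])

def attend(query, memory):
    """Single-head scaled dot-product attention of one query over memory.

    query:  (d,) decoder-side vector.
    memory: (2, d) mean-max representation.
    Returns the attended context vector and the attention weights.
    """
    scores = memory @ query / np.sqrt(query.shape[0])
    weights = np.exp(scores - scores.max())  # numerically stable softmax
    weights /= weights.sum()
    return weights @ memory, weights

# Toy example: three tokens with two-dimensional hidden states.
hidden = np.array([[1.0, -2.0], [3.0, 0.0], [5.0, 4.0]])
rep = mean_max_pool(hidden)
context, weights = attend(np.array([1.0, 0.0]), rep)

# Same sentence with the last position treated as padding.
rep_masked = mean_max_pool(hidden, mask=[True, True, False])
```

Because the decoder attends over both pooled vectors, each decoding step can weight the averaged summary and the most salient features differently, which is the dynamic steering of reconstruction that the abstract describes.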

Code Implementations: 1 repo
