CL | Sep 28, 2018

Learning Robust, Transferable Sentence Representations for Text Classification

Wasi Uddin Ahmad, Xueying Bai, Nanyun Peng, Kai-Wei Chang

arXiv:1810.00681v1 | 0.75 citations

Originality: Incremental advance

AI Analysis

This work addresses data limitations in text classification for NLP practitioners, but it is incremental as it builds on existing pre-training and multi-task learning approaches.

The paper tackles the expensive training and large annotated-data requirements of deep RNNs in text classification. It jointly learns sentence representations from multiple tasks and combines them with pre-trained encoders, yielding robust representations that are validated through extensive experiments on transfer and linguistic tasks.

Although deep recurrent neural networks (RNNs) demonstrate strong performance in text classification, training RNN models is often expensive and requires an extensive collection of annotated data, which may not be available. To overcome this data limitation, existing approaches leverage either pre-trained word embeddings or sentence representations to lift the burden of training RNNs from scratch. In this paper, we show that jointly learning sentence representations from multiple text classification tasks and combining them with pre-trained word-level and sentence-level encoders results in robust sentence representations that are useful for transfer learning. Extensive experiments and analyses using a wide range of transfer and linguistic tasks endorse the effectiveness of our approach.
