Autoencoder as Assistant Supervisor: Improving Text Representation for Chinese Social Media Text Summarization
This work aims to improve text summarization for Chinese social media users and platforms. The contribution is incremental: it builds on existing Seq2Seq frameworks and adds a new supervision technique.
The paper tackles the challenge of generating accurate semantic representations for long and noisy Chinese social media text in abstractive summarization by using a summary autoencoder as an assistant supervisor to Seq2Seq models, achieving state-of-the-art performance on a benchmark dataset.
Most current abstractive text summarization models are based on the sequence-to-sequence model (Seq2Seq). The source content of social media is long and noisy, so it is difficult for Seq2Seq to learn an accurate semantic representation. Compared with the source content, the annotated summary is short and well written. Moreover, it shares the same meaning as the source content. In this work, we supervise the learning of the representation of the source content with that of the summary. In implementation, we regard a summary autoencoder as an assistant supervisor of Seq2Seq. Following previous work, we evaluate our model on a popular Chinese social media dataset. Experimental results show that our model achieves state-of-the-art performance on the benchmark dataset.
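The supervision idea described above can be illustrated with a minimal sketch of a combined training objective. The abstract does not state the exact loss, so everything here is an assumption made for illustration: the squared L2 distance as the similarity measure, the `weight` coefficient, and the function names `squared_l2_distance` and `supervised_loss` are all hypothetical. The sketch only shows how a term that pulls the source representation toward the summary representation might be added to the usual Seq2Seq and autoencoder losses.

```python
from typing import Sequence


def squared_l2_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("representations must have the same dimension")
    return sum((x - y) ** 2 for x, y in zip(a, b))


def supervised_loss(
    seq2seq_nll: float,
    autoencoder_nll: float,
    source_repr: Sequence[float],
    summary_repr: Sequence[float],
    weight: float = 1.0,
) -> float:
    """Combine the two generation losses with a representation-matching term.

    seq2seq_nll     -- negative log-likelihood of the summary given the source
    autoencoder_nll -- negative log-likelihood of the summary given itself
    source_repr     -- encoder representation of the source text
    summary_repr    -- autoencoder representation of the summary
    weight          -- assumed hyperparameter balancing the supervision term
    """
    # Assumption: the supervision signal is a distance between representations.
    supervision = squared_l2_distance(source_repr, summary_repr)
    return seq2seq_nll + autoencoder_nll + weight * supervision


if __name__ == "__main__":
    # Toy numbers only; they do not come from the paper.
    print(supervised_loss(2.0, 1.0, [1.0, 0.0], [0.0, 0.0], weight=0.5))
```

In a real model the two representations would be tensors produced by the encoders, and the supervision term would be backpropagated into the Seq2Seq encoder so that its representation of the noisy source moves toward the cleaner representation learned from the summary.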