Improving Human Text Comprehension through Semi-Markov CRF-based Neural Section Title Generation
This work aims to improve text comprehension, especially for readers with limited reading abilities or in low-resource communities, by generating effective section titles. The contribution is incremental, since it builds on existing extractive and compression methods.
The paper tackles section title generation for long documents in low-resource environments with an extractive pipeline that selects the most salient sentence and compresses it using a Semi-Markov CRF over unsupervised word representations. The approach is competitive with sequence-to-sequence models when training resources are plentiful and strongly outperforms them in low-resource settings. In a human study, the generated titles improved the speed of comprehension tasks while maintaining accuracy.
Titles of short sections within long documents support readers by guiding their focus towards relevant passages and by providing anchor points that help them follow the progression of the document. The positive effects of section titles are even more pronounced for readers with less developed reading abilities, for example in communities with limited labeled text resources. We therefore aim to develop techniques for generating section titles in low-resource environments. In particular, we present an extractive pipeline for section title generation that first selects the most salient sentence and then applies deletion-based compression. Our compression approach is based on a Semi-Markov Conditional Random Field that leverages unsupervised word representations such as ELMo or BERT, eliminating the need for a complex encoder-decoder architecture. The results show that this approach is competitive with sequence-to-sequence models when resources are plentiful, while strongly outperforming them when resources are scarce. In a human-subject study across subjects with varying reading abilities, we find that our section titles improve the speed of completing comprehension tasks while retaining similar accuracy.
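The select-then-compress idea can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify features, training, or the sentence selector, so the sketch assumes per-token salience scores are already given (in the paper's setting such scores would be learned over ELMo or BERT representations), uses the highest mean salience as a toy stand-in for sentence selection, and sets all transition scores to zero. The names `semi_markov_viterbi`, `compress`, and `generate_title` are hypothetical. What the sketch does show is the semi-Markov decoding step, which scores whole segments of tokens labeled KEEP or DELETE rather than single tokens.

```python
from typing import Callable, List, Optional, Tuple

KEEP, DELETE = 0, 1
LABELS = (KEEP, DELETE)


def semi_markov_viterbi(
    n: int,
    max_len: int,
    seg_score: Callable[[int, int, int], float],
    trans_score: Callable[[int, int], float],
) -> List[Tuple[int, int, int]]:
    """Best segmentation of tokens 0..n-1 as (start, end_exclusive, label)."""
    if n == 0:
        return []
    neg_inf = float("-inf")
    # best[i][y]: score of the best segmentation of tokens[:i] whose last
    # segment carries label y; back[i][y] stores (segment start, previous label).
    best = [[neg_inf] * len(LABELS) for _ in range(n + 1)]
    back: List[List[Optional[Tuple[int, Optional[int]]]]] = [
        [None] * len(LABELS) for _ in range(n + 1)
    ]
    for end in range(1, n + 1):
        for y in LABELS:
            # Segments may span at most max_len tokens.
            for start in range(max(0, end - max_len), end):
                s = seg_score(start, end, y)
                if start == 0:
                    # First segment: no incoming transition.
                    candidates = [(s, None)]
                else:
                    candidates = [
                        (best[start][yp] + trans_score(yp, y) + s, yp)
                        for yp in LABELS
                        if best[start][yp] > neg_inf
                    ]
                for score, yp in candidates:
                    if score > best[end][y]:
                        best[end][y] = score
                        back[end][y] = (start, yp)
    # Backtrack from the better-scoring final label.
    y = max(LABELS, key=lambda lab: best[n][lab])
    end = n
    segments = []
    while end > 0:
        start, yp = back[end][y]
        segments.append((start, end, y))
        end, y = start, yp
    segments.reverse()
    return segments


def compress(tokens: List[str], salience: List[float], max_len: int = 3) -> List[str]:
    """Deletion-based compression: keep only tokens inside KEEP segments."""

    def seg_score(start: int, end: int, label: int) -> float:
        # Assumption: a segment is worth keeping in proportion to its summed salience.
        total = sum(salience[start:end])
        return total if label == KEEP else -total

    def trans_score(prev: int, cur: int) -> float:
        # Assumption: transitions are unweighted in this toy model.
        return 0.0

    segs = semi_markov_viterbi(len(tokens), max_len, seg_score, trans_score)
    return [tok for s, e, lab in segs if lab == KEEP for tok in tokens[s:e]]


def generate_title(sentences: List[List[str]], salience: List[List[float]]) -> str:
    """Pick the sentence with the highest mean salience (toy selector), then compress it.

    Assumes every sentence is non-empty.
    """
    idx = max(range(len(sentences)), key=lambda i: sum(salience[i]) / len(salience[i]))
    return " ".join(compress(sentences[idx], salience[idx]))


sentences = [
    ["it", "was", "a", "long", "meeting"],
    ["the", "committee", "finally", "approved", "the", "new", "budget", "yesterday"],
]
salience = [
    [-0.5, -0.4, -0.3, 0.2, 0.4],
    [-0.5, 1.0, -0.8, 1.2, -0.1, 0.3, 1.1, -0.6],
]
print(generate_title(sentences, salience))  # committee approved new budget
```

With zero transition scores and additive segment scores, this toy model reduces to keeping every token with positive salience. A trained Semi-Markov CRF would differ precisely in that its segment and transition scores depend on the span as a whole, which is what allows it to keep or drop multi-word phrases as units.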