Bringing Structure into Summaries: Crowdsourcing a Benchmark Corpus of Concept Maps
This work addresses a gap in summarization research by providing a structured dataset for concept-map-based summaries, though the contribution is incremental, since it focuses on one task variant and one domain.
The authors address the lack of evaluation datasets for concept-map-based multi-document summarization by crowdsourcing a benchmark corpus of concept maps that summarize educational web documents. They release the corpus together with a baseline system and a proposed evaluation protocol.
Concept maps can be used to concisely represent important information and bring structure into large document collections. Therefore, we study a variant of multi-document summarization that produces summaries in the form of concept maps. However, suitable evaluation datasets for this task are currently missing. To close this gap, we present a newly created corpus of concept maps that summarize heterogeneous collections of web documents on educational topics. It was created using a novel crowdsourcing approach that allows us to efficiently determine important elements in large document collections. We release the corpus along with a baseline system and proposed evaluation protocol to enable further research on this variant of summarization.
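To make the target output format concrete, a concept map can be viewed as a labeled graph whose nodes are concepts and whose edges are relations, so that each edge reads as a short proposition. The following is a minimal sketch of that view only. The class names `Proposition` and `ConceptMap`, their fields, and the example content are illustrative assumptions, not the format of the released corpus or of the baseline system.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Proposition:
    """One labeled edge: source concept -> relation -> target concept."""
    source: str
    relation: str
    target: str


@dataclass
class ConceptMap:
    """A concept map stored as a list of propositions (hypothetical layout)."""
    propositions: list = field(default_factory=list)

    def add(self, source, relation, target):
        """Append one proposition connecting two concepts."""
        self.propositions.append(Proposition(source, relation, target))

    def concepts(self):
        """Return all concept labels used in any proposition, sorted."""
        labels = set()
        for p in self.propositions:
            labels.add(p.source)
            labels.add(p.target)
        return sorted(labels)


# Illustrative content paraphrased from the abstract, not taken from the corpus.
cmap = ConceptMap()
cmap.add("concept maps", "bring structure into", "document collections")
cmap.add("concept maps", "represent", "important information")
```

Storing propositions rather than separate node and edge lists keeps the sketch short; the set of concepts can always be derived from the propositions, as `concepts()` does.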