From Standard Summarization to New Tasks and Beyond: Summarization with Manifold Information
This is an incremental survey paper that identifies and reviews new summarization tasks for researchers and practitioners dealing with non-plain text data.
The paper surveys emerging summarization tasks that go beyond plain text, such as query-based summarization of web pages and the summarization of long documents and dialog histories, to address real-world applications where data is not in plain text format.
Text summarization is the research area that aims to create a short, condensed version of an original document that conveys its main idea in a few words. The topic has attracted the attention of a large community of researchers and is now counted among the most promising research areas. In general, text summarization algorithms take a plain text document as input and output a summary. However, in real-world applications, most data is not in plain text format. Instead, there is a wide variety of manifold information to be summarized, such as a web page summarized with respect to a search engine query, an extremely long document (e.g., an academic paper), a dialog history, and so on. In this paper, we survey these new summarization tasks and the approaches to them in real-world applications.