Dimensionality on Summarization
This work addresses the problem of automatically generating satisfactory summaries, which matters to researchers and developers in natural language processing and multimedia analysis, but the contribution appears incremental because it builds on existing summarization approaches.
The paper tackles the challenge of automatic summarization by proposing a multi-dimensional methodology that classifies existing approaches, investigates fundamental language principles, and extends to multimedia summarization, resulting in a general framework for summarization across texts, pictures, videos, and graphs.
Summarization is one of the key features of human intelligence. It plays an important role in understanding and representation. With the rapid and continual expansion of texts, pictures and videos in cyberspace, automatic summarization becomes more and more desirable. Text summarization has been studied for over half a century, but it is still hard to automatically generate a satisfactory summary. Traditional methods process texts empirically and neglect the fundamental characteristics and principles of language use and understanding. This paper summarizes previous text summarization approaches in a multi-dimensional classification space, introduces a multi-dimensional methodology for research and development, unveils the basic characteristics and principles of language use and understanding, investigates some fundamental mechanisms of summarization, studies the dimensions and forms of representations, and proposes a multi-dimensional evaluation mechanism. The investigation extends to the incorporation of pictures into summaries and to the summarization of videos, graphs and pictures, and then reaches a general summarization framework.