Automated Text Summarization Based on Lexical Chains and Graphs Using the WordNet and Wikipedia Knowledge Bases
This work addresses the information overload problem in information retrieval by improving multi-document summarization, though the contribution appears incremental because it builds on existing lexical chain methods.
The paper tackles automatic document summarization by developing an algorithm that uses lexical cohesion features from WordNet and Wikipedia to identify word senses, construct lexical chains, detect topics, and select important sentences. Experimental results on DUC01 and DUC02 benchmarks show improved performance compared to state-of-the-art approaches.
The technology of automatic document summarization is maturing and may provide a solution to the information overload problem. Document summarization now plays an important role in information retrieval: with a large volume of documents, presenting the user with a summary of each document greatly facilitates the task of finding the desired documents. Document summarization is the process of automatically creating a compressed version of a given document that provides useful information to users, whereas multi-document summarization produces a summary that delivers the majority of the information content from a set of documents about an explicit or implicit main topic. The lexical cohesion structure of a text can be exploited to determine the importance of a sentence or phrase, and lexical chains are useful tools for analyzing that structure. In this paper we consider the effect of using lexical cohesion features in summarization and present an algorithm based on the WordNet and Wikipedia knowledge bases. Our algorithm first finds the correct sense of each word, then constructs the lexical chains, removes the chains that score lower than the others, roughly detects topics from the remaining lexical chains, segments the text with respect to those topics, and selects the most important sentences. Experimental results on the open benchmark datasets DUC01 and DUC02 show that the proposed approach can improve performance compared to state-of-the-art summarization approaches.
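The chain-building, chain-filtering, and sentence-selection steps described above can be illustrated with a minimal sketch. This is not the paper's implementation: the RELATED table is a hypothetical toy stand-in for WordNet and Wikipedia lookups, word sense disambiguation and topic segmentation are omitted, and both the chain score (number of members) and the "keep chains at or above the mean score" rule are assumptions made only for illustration.

```python
import re

# Hypothetical toy relatedness table; a real system would query
# WordNet/Wikipedia after disambiguating each word's sense.
RELATED = {
    "car": {"vehicle", "engine"},
    "vehicle": {"car", "engine"},
    "engine": {"car", "vehicle"},
    "bank": {"money", "loan"},
    "money": {"bank", "loan"},
    "loan": {"bank", "money"},
}

def tokenize(sentence):
    """Lowercase a sentence and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", sentence.lower())

def build_chains(sentences):
    """Greedy chaining: each known word joins the first chain that holds
    an identical or related word, otherwise it starts a new chain."""
    chains = []  # each chain is a list of (word, sentence_index)
    for idx, sent in enumerate(sentences):
        for word in tokenize(sent):
            if word not in RELATED:
                continue  # words outside the toy table are ignored
            for chain in chains:
                if any(word == w or word in RELATED[w] for w, _ in chain):
                    chain.append((word, idx))
                    break
            else:
                chains.append([(word, idx)])
    return chains

def strong_chains(chains):
    """Assumed filter: keep chains whose size is at least the mean size."""
    if not chains:
        return []
    mean = sum(len(c) for c in chains) / len(chains)
    return [c for c in chains if len(c) >= mean]

def summarize(sentences, k):
    """Score each sentence by how many strong-chain members it contains,
    then return the k best sentences in their original order."""
    scores = [0] * len(sentences)
    for chain in strong_chains(build_chains(sentences)):
        for _, idx in chain:
            scores[idx] += 1
    ranked = sorted(range(len(sentences)), key=lambda i: (-scores[i], i))
    return [sentences[i] for i in sorted(ranked[:k])]
```

For example, given a short text in which two sentences mention "car", "vehicle", and "engine" and one mentions "bank" and "loan", calling summarize with k=2 would keep the two sentences covered by the larger chain. Swapping the toy table for real WordNet relations would change which words link together but not the overall flow.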