IR · CL · Nov 15, 2018

Automatic Text Document Summarization using Semantic-based Analysis

arXiv:1811.06567v1
Originality: Synthesis-oriented
AI Analysis

This work addresses information overload for internet users by enhancing text summarization, but it appears incremental as it builds on existing extractive methods with added features.

The paper tackles text document summarization to address information overload by proposing an extractive approach that incorporates statistical, semantic, and sentiment features, aiming to improve content coverage and reduce redundancy, though no concrete results or numbers are provided.

Since the advent of the web, the amount of data on the web has increased several million fold. The web data generated in recent years exceeds what had been stored over many preceding years. One important data format is text. To answer user queries over the internet and to overcome the problem of information overload, one possible solution is text document summarization. This not only reduces query access time but also optimizes the document results according to a specific user's requirements. Text document summarization can be categorized as abstractive or extractive. Most prior work has been done in the direction of extractive summarization. An extractive summary is a subset of the original document, chosen with the objective of more content coverage and less redundancy. Our work is based on extractive approaches. In the first approach, we use statistical features and semantic-based features. Including sentiment as a feature is an idea drawn from the view that emotion plays an important role: it conveys a message effectively, so it may play a vital role in text document summarization.
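
The extractive idea described in the abstract (pick a subset of the original sentences that covers the content while avoiding redundancy) can be illustrated with a minimal sketch. This is not the authors' method: the scoring formula, the toy `SENTIMENT_WORDS` lexicon, the `sentiment_weight` and `max_overlap` parameters, and all function names are assumptions made for illustration. Semantic features are not modeled here, and redundancy is approximated by plain word overlap (Jaccard similarity).

```python
import re
from collections import Counter

# Hypothetical toy lexicon; a real system would use a proper sentiment resource.
SENTIMENT_WORDS = {"good", "great", "excellent", "bad", "terrible", "vital", "important"}


def tokenize(text):
    """Lowercase the text and return its alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def split_sentences(document):
    """Naive sentence splitter: break on whitespace that follows ., ! or ?."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", document) if s.strip()]


def score_sentence(tokens, doc_freq, sentiment_weight=0.5):
    """Combine a statistical feature with a sentiment feature (assumed formula)."""
    if not tokens:
        return 0.0
    # Statistical feature: average document-level frequency of the sentence's words.
    frequency = sum(doc_freq[t] for t in tokens) / len(tokens)
    # Sentiment feature: fraction of the sentence's words found in the lexicon.
    sentiment = sum(t in SENTIMENT_WORDS for t in tokens) / len(tokens)
    return frequency + sentiment_weight * sentiment


def jaccard(a, b):
    """Word-overlap similarity between two token lists."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0


def summarize(document, k=2, max_overlap=0.5):
    """Greedily pick up to k high-scoring sentences, skipping near-duplicates."""
    sentences = split_sentences(document)
    tokenized = [tokenize(s) for s in sentences]
    doc_freq = Counter(t for tokens in tokenized for t in tokens)
    ranked = sorted(
        range(len(sentences)),
        key=lambda i: score_sentence(tokenized[i], doc_freq),
        reverse=True,
    )
    chosen = []
    for i in ranked:
        if len(chosen) == k:
            break
        # Redundancy check: reject a sentence too similar to one already chosen.
        if all(jaccard(tokenized[i], tokenized[j]) <= max_overlap for j in chosen):
            chosen.append(i)
    # Return the selected sentences in their original document order.
    return [sentences[i] for i in sorted(chosen)]


if __name__ == "__main__":
    doc = "Cats are great pets. Cats are great pets. Dogs need long walks. Birds sing."
    print(summarize(doc, k=2))
```

In this sketch the repeated sentence is selected only once, because its overlap with the already chosen copy exceeds `max_overlap`; the output is always a subset of the input sentences, which is the defining property of an extractive summary.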

