CL · Jan 24, 2021

A Novel Two-stage Framework for Extracting Opinionated Sentences from News Articles

Rajkumar Pujari, Swara Desai, Niloy Ganguly, Pawan Goyal

arXiv:2101.09743v1 35.91086 citations

Originality: Incremental advance

AI Analysis

This work addresses the need for automated opinion extraction in news analysis, though it appears incremental as it builds on existing methods like Naive Bayes and HITS.

The paper tackles the problem of extracting opinionated sentences from news articles by proposing a two-stage framework that combines a Naive Bayes classifier with the HITS algorithm, and it reports significantly improved precision over the baseline Naive Bayes classifier.

This paper presents a novel two-stage framework for extracting opinionated sentences from a given news article. In the first stage, a Naive Bayes classifier uses local features to assign a score to each sentence; the score signifies the probability that the sentence is opinionated. In the second stage, we use this prior within the HITS (Hyperlink-Induced Topic Search) schema to exploit the global structure of the article and the relations between its sentences. In the HITS schema, the opinionated sentences are treated as Hubs and the facts around these opinions are treated as Authorities. The algorithm is implemented and evaluated against a set of manually marked data. We show that using HITS significantly improves precision over the baseline Naive Bayes classifier. We also argue that the proposed method discovers the underlying structure of the article, thus extracting the various opinions it contains, each grouped with its supporting facts and other supporting opinions.
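The second stage can be illustrated with a minimal sketch. The abstract does not give the paper's update equations or how sentences are linked, so the details below are assumptions: the function name `hits_with_prior`, the hand-made 0/1 sentence graph `adjacency`, and the choice to inject the Naive Bayes prior by multiplying it into each hub update are all hypothetical. The first-stage scores in `prior` are made-up numbers standing in for classifier output.

```python
import math


def hits_with_prior(adjacency, prior, iterations=50):
    """Run HITS-style power iteration seeded and weighted by prior scores.

    adjacency[i][j] > 0 means sentence i is linked to sentence j
    (an assumed relatedness graph, not the paper's construction).
    prior[i] is the first-stage probability that sentence i is opinionated.
    """
    n = len(prior)
    hub = list(prior)  # start hubs from the classifier's scores
    auth = [1.0] * n

    def normalize(values):
        # L2-normalize so scores stay bounded across iterations
        norm = math.sqrt(sum(v * v for v in values)) or 1.0
        return [v / norm for v in values]

    for _ in range(iterations):
        # Authority update: a sentence is a good "fact" if strong hubs link to it
        auth = normalize(
            [sum(adjacency[i][j] * hub[i] for i in range(n)) for j in range(n)]
        )
        # Hub update: a sentence is a good "opinion" if it links to strong
        # authorities; scaling by the prior is this sketch's assumption
        hub = normalize(
            [prior[i] * sum(adjacency[i][j] * auth[j] for j in range(n))
             for i in range(n)]
        )
    return hub, auth


# Toy article with four sentences; 0 and 3 look opinionated to the classifier
prior = [0.9, 0.2, 0.1, 0.8]
adjacency = [
    [0, 1, 1, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 1],
    [0, 0, 1, 0],
]

hub, auth = hits_with_prior(adjacency, prior)
ranked_opinions = sorted(range(len(hub)), key=lambda i: hub[i], reverse=True)
ranked_facts = sorted(range(len(auth)), key=lambda i: auth[i], reverse=True)
print("sentences ranked by hub score:", ranked_opinions)
print("sentences ranked by authority score:", ranked_facts)
```

In this toy graph the high-prior sentences end up with the largest hub scores and the sentences they link to collect the authority mass, which mirrors the Hub/Authority reading in the abstract: opinions as hubs, surrounding facts as authorities.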
