CL · May 28, 2017

Subject Specific Stream Classification Preprocessing Algorithm for Twitter Data Stream

arXiv:1705.09995v1 · 2 citations
Originality: Synthesis-oriented
AI Analysis

The paper addresses the need for more efficient and relevant data mining on Twitter streams, but it appears incremental because it focuses on preprocessing for existing methods.

The paper tackles inefficient data mining on Twitter, caused by the diversity of subjects in the stream, by proposing an algorithm that classifies the entire stream into mutually exclusive streams. Data mining can then be run on each stream separately, giving more relevant results with high efficiency.

The micro-blogging service Twitter is a lucrative source for data mining applications on global sentiment. But because of the wide variety of subjects mentioned in each data item, it is inefficient to run a data mining algorithm on the raw data. This paper discusses an algorithm to accurately classify the entire stream into a given number of mutually exclusive, collectively exhaustive streams, on each of which the data mining algorithm can be run separately, yielding more relevant results with high efficiency.
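
The abstract does not say how the classification is done, so the following is only a minimal sketch of what a mutually exclusive, collectively exhaustive split could look like, not the paper's algorithm. It assumes a simple keyword-overlap score; the subject names, keyword sets, the `other` catch-all stream, and the functions `classify` and `split_stream` are all hypothetical and introduced purely for illustration.

```python
import re
from collections import defaultdict

# Hypothetical subject keyword sets; not taken from the paper.
SUBJECTS = {
    "sports": {"match", "goal", "league"},
    "politics": {"election", "vote", "senate"},
}
# Assumed fallback stream, so that every tweet lands somewhere (exhaustive).
CATCH_ALL = "other"


def classify(text, subjects=SUBJECTS):
    """Return exactly one stream label for a tweet (exclusive)."""
    tokens = set(re.findall(r"[a-z0-9']+", text.lower()))
    best_label, best_score = CATCH_ALL, 0
    for label, keywords in subjects.items():
        score = len(tokens & keywords)
        # Strict '>' keeps the earliest-listed subject on ties,
        # so a tweet is never assigned to two streams.
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def split_stream(tweets, subjects=SUBJECTS):
    """Partition tweets into one list per stream label."""
    streams = defaultdict(list)
    for tweet in tweets:
        streams[classify(tweet, subjects)].append(tweet)
    return dict(streams)
```

In this sketch, exclusivity comes from returning a single label per tweet, and exhaustiveness comes from the catch-all stream. A downstream mining step could then be run on each list returned by `split_stream` separately.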

