IR, Jun 19, 2013

Hourly Traffic Prediction of News Stories

arXiv:1306.4608v1, 22 citations
Originality: Synthesis-oriented
AI Analysis

This paper addresses the problem of forecasting news popularity, which matters to both news producers and readers, but its contribution is incremental: it applies existing methods to a specific dataset.

The paper predicts hourly clicks on news stories by combining additive regression and bagging over M5P regression trees, achieving a mean relative error of 11.99% and placing 4th out of 26 participants in the 1st Sapo Data Challenge.

Predicting the popularity of news stories from several news sources has become a challenge of great importance for both news producers and readers. In this paper, we investigate methods for automatically predicting the number of clicks a news story receives during one hour. Our approach is a combination of additive regression and bagging applied over an M5P regression tree, using a logarithmic scale (log10). The features included are social-based (social network metadata from Facebook), content-based (automatically extracted keyphrases and stylometric statistics from news titles), and time-based. In the 1st Sapo Data Challenge we obtained a mean relative error of 11.99%, which put us in 4th place out of 26 participants.
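
The combination described in the abstract can be illustrated with a minimal, standard-library sketch. It is not the authors' implementation, and it rests on several assumptions: a one-split regression stump stands in for the M5P tree (M5P fits linear models at its leaves), additive regression is assumed to wrap the bagged learner, the log10 transform uses a +1 offset to tolerate zero clicks, and mean relative error is assumed to be the average of |predicted - actual| / actual. The single feature (hours since publication) and the click counts are made-up toy data.

```python
import math
import random


def to_log(clicks):
    """Map a click count to the log10 scale (the +1 offset is an assumption)."""
    return math.log10(clicks + 1)


def from_log(value):
    """Invert to_log, returning a click count."""
    return 10 ** value - 1


def fit_stump(xs, ys):
    """One-split regression stump on one feature (simplified stand-in for M5P)."""
    best = None
    cuts = sorted(set(xs))
    for lo, hi in zip(cuts, cuts[1:]):
        t = (lo + hi) / 2
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        ml, mr = sum(left) / len(left), sum(right) / len(right)
        sse = sum((y - ml) ** 2 for y in left) + sum((y - mr) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, t, ml, mr)
    if best is None:  # only one distinct x value: predict the mean
        mean = sum(ys) / len(ys)
        return lambda x: mean
    _, t, ml, mr = best
    return lambda x: ml if x <= t else mr


def fit_bagged(xs, ys, n_bags, rng):
    """Bagging: average stumps trained on bootstrap resamples of the data."""
    n = len(xs)
    models = []
    for _ in range(n_bags):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return lambda x: sum(m(x) for m in models) / len(models)


def fit_additive(xs, ys, n_rounds=10, shrinkage=0.5, n_bags=5, seed=0):
    """Additive regression: each round fits the residuals left by earlier rounds."""
    rng = random.Random(seed)
    base = sum(ys) / len(ys)
    residuals = [y - base for y in ys]
    stages = []
    for _ in range(n_rounds):
        stage = fit_bagged(xs, residuals, n_bags, rng)
        stages.append(stage)
        residuals = [r - shrinkage * stage(x) for x, r in zip(xs, residuals)]
    return lambda x: base + shrinkage * sum(s(x) for s in stages)


def mean_relative_error(actual, predicted):
    """Average of |predicted - actual| / actual; assumes every actual > 0."""
    return sum(abs(p - a) / a for a, p in zip(actual, predicted)) / len(actual)


# Toy data (hypothetical): hours since publication and hourly click counts.
hours = [1, 2, 3, 4, 5, 6, 7, 8]
clicks = [12, 15, 11, 14, 900, 1100, 950, 1000]

model = fit_additive(hours, [to_log(c) for c in clicks])  # train on log10 scale
predicted = [from_log(model(h)) for h in hours]           # back to click counts
print(mean_relative_error(clicks, predicted))
```

Training on the log10 scale keeps a handful of very popular stories from dominating the squared-error fit, which is one plausible reason for the transform when the final metric is a relative error.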

