Canonical Trends: Detecting Trend Setters in Web Data
This work addresses the challenge of identifying early trend setters in web data mining. The contribution is incremental: it builds on existing trend detection methods.
The paper tackles the problem of detecting the web sources that first publish information leading to trends. It presents a method that identifies these trend setters and validates it on real technology news feeds.
Much information available on the web is copied, reused or rephrased. The phenomenon that multiple web sources pick up certain information is often called a trend. A central problem in web data mining is to detect those web sources that are first to publish information that will give rise to a trend. We present a simple and efficient method for finding trends dominating a pool of web sources and identifying those web sources that publish the information relevant to a trend before others. We validate our approach on real data collected from influential technology news feeds.
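To make the problem statement concrete, the following is a minimal sketch of a naive baseline, not the method presented in the paper. It assumes each post is a hypothetical `(source, timestamp, terms)` tuple, treats a term as a "trend" once at least `min_sources` distinct sources have used it, and names the earliest source as the trend setter. The function name, the data layout, and the threshold are all illustrative assumptions.

```python
from collections import defaultdict


def find_trend_setters(posts, min_sources=2):
    """Naive baseline (an assumption, not the paper's method).

    posts: iterable of (source, timestamp, terms) tuples.
    Returns a dict mapping each trending term to the source that used it first.
    """
    # term -> {source: earliest timestamp at which that source used the term}
    first_seen = defaultdict(dict)
    for source, timestamp, terms in posts:
        for term in set(terms):
            seen = first_seen[term]
            if source not in seen or timestamp < seen[source]:
                seen[source] = timestamp

    setters = {}
    for term, seen in first_seen.items():
        # A term counts as a trend only if enough distinct sources picked it up.
        if len(seen) >= min_sources:
            # Earliest timestamp wins; ties are broken by source name.
            setters[term] = min(seen, key=lambda s: (seen[s], s))
    return setters


# Toy data with made-up feed names and integer timestamps.
posts = [
    ("feed_a", 1, ["tablet", "launch"]),
    ("feed_b", 2, ["tablet"]),
    ("feed_c", 3, ["tablet", "recall"]),
    ("feed_b", 4, ["recall"]),
    ("feed_a", 5, ["merger"]),
]
setters = find_trend_setters(posts)
```

On this toy data, "tablet" and "recall" are each picked up by more than one feed, so they count as trends, while "launch" and "merger" appear in a single feed and are ignored. Exact term matching is a deliberate simplification: it would miss the rephrased content that the abstract mentions.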