LG · Jun 24, 2014

Mining Recurrent Concepts in Data Streams using the Discrete Fourier Transform

arXiv:1406.6114v1 · 19 citations

Originality: Incremental advance

AI Analysis

This work addresses the challenge of efficiently re-using classifiers in data stream environments. The advance is incremental: it builds on existing concept drift methods and adds a novel compression technique.

The paper tackles the problem of capturing recurring concepts in data streams by applying the Discrete Fourier Transform to Decision Tree classifiers, compressing them at concept drift points and storing them for re-use. Empirical results show that this approach captures recurring concepts with higher precision than re-using the original classifiers, and that the Fourier compressed trees are more robust to noise.

In this research we address the problem of capturing recurring concepts in a data stream environment. Recurrence capture enables the re-use of previously learned classifiers without the need for re-learning while providing for better accuracy during the concept recurrence interval. We capture concepts by applying the Discrete Fourier Transform (DFT) to Decision Tree classifiers to obtain highly compressed versions of the trees at concept drift points in the stream and store such trees in a repository for future use. Our empirical results on real world and synthetic data exhibiting varying degrees of recurrence show that the Fourier compressed trees are more robust to noise and are able to capture recurring concepts with higher precision than a meta learning approach that chooses to re-use classifiers in their originally occurring form.
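The abstract does not spell out how a tree is turned into a spectrum, so the following is only a minimal sketch of the general idea, assuming binary features and the Walsh (parity) basis commonly used for the Fourier transform of Boolean functions. The tree `tree_predict` and the helpers `fourier_spectrum`, `truncate`, and `reconstruct` are hypothetical names introduced for illustration and are not taken from the paper.

```python
import itertools
import numpy as np

def tree_predict(x):
    # Hypothetical depth-2 decision tree over 4 binary features (assumption).
    # Feature x[3] is never tested by the tree.
    if x[0] == 0:
        return 1 if x[1] == 1 else 0
    return 1 if x[2] == 0 else 0

def fourier_spectrum(f, d):
    # Walsh-basis DFT of f over {0,1}^d:
    #   w_j = (1 / 2^d) * sum_x f(x) * (-1)^(j . x)
    points = list(itertools.product([0, 1], repeat=d))
    values = np.array([f(x) for x in points], dtype=float)
    spectrum = {}
    for j in points:
        signs = np.array([(-1) ** int(np.dot(j, x)) for x in points])
        spectrum[j] = float(np.mean(values * signs))
    return spectrum

def truncate(spectrum, max_order):
    # Keep only non-zero coefficients whose order (number of 1s in j)
    # is at most max_order; this is the compression step in this sketch.
    return {j: w for j, w in spectrum.items()
            if sum(j) <= max_order and abs(w) > 1e-12}

def reconstruct(spectrum, x):
    # Inverse transform f(x) = sum_j w_j * (-1)^(j . x), thresholded at 0.5
    # to recover a 0/1 class label.
    total = sum(w * (-1) ** int(np.dot(j, x)) for j, w in spectrum.items())
    return 1 if total >= 0.5 else 0

if __name__ == "__main__":
    full = fourier_spectrum(tree_predict, 4)
    compressed = truncate(full, max_order=2)
    print(len(full), "coefficients before truncation,", len(compressed), "kept")
    for x in itertools.product([0, 1], repeat=4):
        assert reconstruct(compressed, x) == tree_predict(x)
```

In this toy case every root-to-leaf path tests at most two features, so all coefficients of order above 2 vanish and the truncated spectrum reproduces the tree exactly. On deeper trees, truncation would generally be lossy, and how the cut-off is chosen is a design decision this sketch does not address.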
