LG · ML · Jul 4, 2012

On the Detection of Concept Changes in Time-Varying Data Stream by Testing Exchangeability

arXiv:1207.1379v1 · 8 citations
Originality: Synthesis-oriented
AI Analysis

This work addresses concept drift detection in data streams, presenting incremental improvements to an existing method, the martingale framework of Ho (2005).

The paper tackles the problem of detecting concept changes in time-varying data streams by analyzing a martingale-based test for exchangeability. It shows that the test approximates the sequential probability ratio test (SPRT) and uses that approximation to relate the threshold value to the test's size, power, and mean delay time. Experiments on simulated and benchmark data sets indicate that the test detects changes effectively.

A martingale framework for concept change detection based on testing data exchangeability was recently proposed (Ho, 2005). In this paper, we describe the proposed change-detection test based on Doob's Maximal Inequality and show that it is an approximation of the sequential probability ratio test (SPRT). The relationship between the threshold value used in the proposed test and its size and power is deduced from the approximation. The mean delay time before a change is detected is estimated using the average sample number of an SPRT. The performance of the test using various threshold values is examined on five different data stream scenarios simulated using two synthetic data sets. Finally, experimental results show that the test is effective in detecting changes in time-varying data streams simulated using three benchmark data sets.
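
To make the idea of a martingale test for exchangeability more concrete, here is a minimal sketch of one way such a detector could look. It is not the paper's implementation. The randomized power martingale update, the value `epsilon = 0.92`, the distance-to-mean strangeness score, and the function name `detect_change` are all illustrative assumptions, since the abstract does not specify them. The one grounded ingredient is the threshold rule: for a nonnegative martingale with initial value 1, Doob's Maximal Inequality bounds the probability of ever reaching a threshold λ by 1/λ, so λ = 20 bounds the false-alarm rate by 5% when the data are exchangeable.

```python
import random


def detect_change(stream, threshold=20.0, epsilon=0.92, seed=0):
    """Return the 0-based index at which a change is flagged, or None.

    Sketch only: strangeness measure and epsilon are assumed, not
    taken from the paper.
    """
    rng = random.Random(seed)   # private RNG for p-value tie-breaking
    seen = []                   # observations so far
    martingale = 1.0            # M_0 = 1

    for i, x in enumerate(stream):
        seen.append(x)
        n = len(seen)
        mean = sum(seen) / n

        # Assumed strangeness: distance of each point from the current mean.
        # Recomputing for all points keeps the scores exchangeable
        # when the data are exchangeable.
        scores = [abs(v - mean) for v in seen]
        s_new = scores[-1]

        # Randomized p-value of the newest point.
        greater = sum(1 for s in scores if s > s_new)
        equal = sum(1 for s in scores if s == s_new)  # counts the new point
        p = (greater + rng.random() * equal) / n
        p = max(p, 1e-12)       # guard against p == 0

        # Randomized power martingale update (assumed form).
        martingale *= epsilon * p ** (epsilon - 1.0)

        # Flag a change once the martingale reaches the threshold.
        if martingale >= threshold:
            return i
    return None
```

Calling `detect_change` on a list of numbers whose mean shifts partway through should return an index shortly after the shift. A larger `threshold` lowers the false-alarm bound at the cost of a longer detection delay, which is the trade-off the paper quantifies through the SPRT approximation.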

