ML LG · Jul 25, 2017

Concept Drift Detection and Adaptation with Hierarchical Hypothesis Testing

Shujian Yu, Zubin Abraham, Heng Wang, Mohak Shah, Yantao Wei, José C. Príncipe

arXiv:1707.07821v69.057 citations

Originality: Incremental advance

AI Analysis

This paper addresses the performance deterioration of streaming classification models under changing data distributions, offering an incremental improvement over existing drift detection and adaptation techniques.

The paper tackles concept drift in streaming classification by introducing a hierarchical hypothesis testing (HHT) framework and the HLFR detector, which the authors report outperform state-of-the-art methods in detection precision, detection delay, and adaptability across drift types.

A fundamental issue for statistical classification models in a streaming environment is that the joint distribution of predictor and response variables changes over time (a phenomenon known as concept drift), so that classification performance deteriorates dramatically. In this paper, we first present a hierarchical hypothesis testing (HHT) framework that can detect and adapt to various concept drift types (e.g., recurrent or irregular, gradual or abrupt), even in the presence of imbalanced data labels. A novel concept drift detector, Hierarchical Linear Four Rates (HLFR), is then implemented under the HHT framework. By replacing a widely acknowledged retraining scheme with an adaptive training strategy, we further demonstrate that the concept drift adaptation capability of HLFR can be significantly boosted. A theoretical analysis of the Type-I and Type-II errors of HLFR is also provided. Experiments on both simulated and real-world datasets illustrate that our methods outperform state-of-the-art methods in terms of detection precision, detection delay, and adaptability across different concept drift types.
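To make the idea of a hierarchical test concrete, here is a minimal Python sketch of a generic two-layer drift check on a stream of 0/1 misclassification indicators. It is an illustration under stated assumptions, not the HLFR algorithm: the abstract does not specify the tests used at each layer, so the first layer here is assumed to be a simple two-proportion z-score and the second a permutation test on the error-rate difference. The function names, the `split` argument (a hypothetical candidate change point), and all thresholds are invented for this example.

```python
import random
from math import sqrt


def layer1_flag(errors, split, z_threshold=2.0):
    """Layer I (assumed): cheap two-proportion z-score on error rates
    before and after the candidate change point `split`."""
    before, after = errors[:split], errors[split:]
    p_before = sum(before) / len(before)
    p_after = sum(after) / len(after)
    pooled = sum(errors) / len(errors)
    se = sqrt(pooled * (1 - pooled) * (1 / len(before) + 1 / len(after)))
    if se == 0:  # all predictions correct or all wrong: nothing to compare
        return False
    return abs(p_after - p_before) / se > z_threshold


def layer2_confirm(errors, split, n_perm=500, alpha=0.05, seed=0):
    """Layer II (assumed): permutation test that confirms or rejects
    the candidate drift flagged by Layer I."""
    rng = random.Random(seed)

    def rate_gap(seq):
        return abs(sum(seq[split:]) / (len(seq) - split) - sum(seq[:split]) / split)

    observed = rate_gap(errors)
    shuffled = list(errors)
    at_least_as_extreme = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if rate_gap(shuffled) >= observed:
            at_least_as_extreme += 1
    p_value = (at_least_as_extreme + 1) / (n_perm + 1)
    return p_value < alpha


def detect_drift(errors, split):
    """Report drift only if the cheap test fires AND the costlier test agrees."""
    return layer1_flag(errors, split) and layer2_confirm(errors, split)


# Usage: the error rate jumps from 10% to 50% at position 100.
stable = [1 if i % 10 == 0 else 0 for i in range(100)]
drifted = [1 if i % 2 == 0 else 0 for i in range(100)]
print(detect_drift(stable + drifted, split=100))
```

The point of the layering is that the second, more expensive test runs only when the first one raises a candidate, which is one plausible way to trade detection delay against false alarms.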
