LG | Mar 25, 2023

CADM: Confusion Model-based Detection Method for Real-drift in Chunk Data Stream

arXiv:2303.16906v1 | 9 citations | h-index: 12
Originality: Incremental advance
AI Analysis

This paper addresses concept drift detection for applications such as health monitoring and fault diagnosis, but it appears incremental because it builds on existing confusion-based approaches.

The paper tackles the problem of detecting real concept drift in chunk data streams with limited annotations by proposing a confusion model-based detection method that uses both real and pseudo labels to update the model and measure prediction differences via cosine similarity. Experiments show the method achieves low false alarm and false negative rates across different classifiers.

Concept drift detection has attracted considerable attention due to its importance in many real-world applications such as health monitoring and fault diagnosis. Most advanced approaches either perform poorly when the evaluation criteria of the environment change (i.e., concept drift) or can only detect and adapt to virtual drift. In this paper, we propose a new approach, based on concept confusion, to detect real drift in a chunk data stream with limited annotations. When a new data chunk arrives, we use both real labels and pseudo labels to update the model after prediction and drift detection. In this setting, the model becomes confused and yields a prediction difference once drift occurs. We then adopt cosine similarity to measure the difference, and propose an adaptive threshold method to identify abnormal values. Experiments show that our method achieves a low false alarm rate and a low false negative rate with different classifiers.
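
The detection step described in the abstract (cosine similarity between predictions, followed by an adaptive threshold) can be illustrated with a minimal Python sketch. The abstract does not specify which prediction vectors are compared or how the threshold adapts, so everything below is an assumption for illustration: the two inputs are taken to be any pair of prediction vectors for the same chunk, the threshold is a simple mean-minus-k-standard-deviations rule over past similarities, and the names `cosine_similarity`, `is_drift`, `k`, and `min_history` are hypothetical rather than taken from the paper.

```python
import numpy as np


def cosine_similarity(a, b):
    """Cosine similarity between two prediction vectors (flattened if 2-D)."""
    a = np.asarray(a, dtype=float).ravel()
    b = np.asarray(b, dtype=float).ravel()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    # Return 0.0 for a zero vector instead of dividing by zero.
    return float(a @ b / denom) if denom > 0 else 0.0


def is_drift(history, new_sim, k=3.0, min_history=5):
    """Flag drift when new_sim falls well below the past similarities.

    ASSUMPTION: the threshold is mean - k * std of the similarity history.
    This is a stand-in rule, not the paper's adaptive threshold.
    """
    if len(history) < min_history:
        return False  # not enough chunks seen to estimate a threshold
    mean, std = np.mean(history), np.std(history)
    return bool(new_sim < mean - k * std)


# Usage: similarities from earlier (stable) chunks, then a new chunk whose
# two prediction vectors disagree strongly.
history = [0.99, 0.98, 0.99, 0.97, 0.98]
stable = cosine_similarity([0.9, 0.1, 0.8, 0.2], [0.88, 0.12, 0.79, 0.21])
confused = cosine_similarity([0.9, 0.1, 0.8, 0.2], [0.1, 0.9, 0.2, 0.8])
print(is_drift(history, stable), is_drift(history, confused))
```

A lower-tail test is used here because a confused model is expected to produce less similar predictions, which lowers the cosine similarity; other threshold rules would fit the same structure.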
