Drift Detection: Introducing Gaussian Split Detector
The Gaussian Split Detector addresses a key limitation for real-world applications, where labels are often unavailable at detection time, though the contribution is incremental because it builds on existing drift detection methods.
The paper tackles the problem of detecting drift in data streams without requiring ground truth labels during detection, introducing the Gaussian Split Detector (GSD) that outperforms state-of-the-art methods in distinguishing real from virtual drift.
Recent research has yielded a wide array of drift detectors. However, to achieve remarkable performance, these detectors need the true class labels during the drift detection phase. This paper targets drift detection when the ground truth is unknown during the detection phase. To that end, we introduce the Gaussian Split Detector (GSD), a novel drift detector that works in batch mode. GSD is designed to work when the data follow a normal distribution, and it uses Gaussian mixture models to monitor changes in the decision boundary. The algorithm is designed to handle multidimensional data streams and to work without ground truth labels during the inference phase, making it pertinent for real-world use. In an extensive experimental study on real and synthetic datasets, we evaluate our detector against the state of the art. We show that our detector outperforms the state of the art in detecting real drift and in ignoring virtual drift, which is key to avoiding false alarms.
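To make the idea of monitoring a decision boundary with a Gaussian mixture model concrete, the following is a minimal one-dimensional sketch. It is not the GSD algorithm from the paper, whose details are not given above. It assumes two roughly Gaussian classes, fits a two-component mixture to each unlabeled batch with plain EM, and flags a batch when the point where the two weighted densities cross moves by more than a threshold. The function names, the median-split initialisation, the synthetic data, and the threshold of 0.5 are all assumptions made for illustration.

```python
import math
import random
import statistics


def fit_gmm_1d(xs, iters=50):
    """Fit a two-component 1-D Gaussian mixture to unlabeled data with EM."""
    s = sorted(xs)
    n = len(s)
    # Assumed initialisation: split the sorted batch at its median.
    halves = [s[: n // 2], s[n // 2:]]
    mu = [statistics.fmean(h) for h in halves]
    var = [max(statistics.pvariance(h), 1e-6) for h in halves]
    w = [0.5, 0.5]
    for _ in range(iters):
        # E-step: responsibility of component 0 for every point.
        resp = []
        for x in xs:
            p = [
                w[k] * math.exp(-((x - mu[k]) ** 2) / (2 * var[k]))
                / math.sqrt(2 * math.pi * var[k])
                for k in range(2)
            ]
            total = p[0] + p[1]
            resp.append(p[0] / total if total > 0 else 0.5)
        # M-step: re-estimate weights, means, and variances.
        for k in range(2):
            r = resp if k == 0 else [1.0 - v for v in resp]
            nk = sum(r)
            mu[k] = sum(ri * x for ri, x in zip(r, xs)) / nk
            var[k] = max(
                sum(ri * (x - mu[k]) ** 2 for ri, x in zip(r, xs)) / nk, 1e-6
            )
            w[k] = nk / n
    return w, mu, var


def log_weighted_pdf(x, w, mu, var):
    """Log of a component's weight times its Gaussian density at x."""
    return math.log(w) - 0.5 * math.log(2 * math.pi * var) - (x - mu) ** 2 / (2 * var)


def boundary(w, mu, var):
    """Find, by bisection between the means, where the weighted densities cross."""
    a, b = (0, 1) if mu[0] <= mu[1] else (1, 0)
    lo, hi = mu[a], mu[b]
    for _ in range(60):
        mid = (lo + hi) / 2
        diff = log_weighted_pdf(mid, w[a], mu[a], var[a]) - log_weighted_pdf(
            mid, w[b], mu[b], var[b]
        )
        if diff > 0:
            lo = mid  # left component still dominates, move right
        else:
            hi = mid
    return (lo + hi) / 2


def sample(mu_a, mu_b, n=500):
    """Synthetic batch: two unit-variance Gaussian classes, labels discarded."""
    return [random.gauss(mu_a, 1.0) for _ in range(n)] + [
        random.gauss(mu_b, 1.0) for _ in range(n)
    ]


random.seed(0)
reference = sample(0.0, 4.0)   # batch seen at training time
same = sample(0.0, 4.0)        # new batch, same distribution
shifted = sample(0.0, 7.0)     # new batch, second class has moved

ref_boundary = boundary(*fit_gmm_1d(reference))
THRESHOLD = 0.5  # hypothetical tolerance, chosen by hand for this data


def boundary_moved(batch):
    """Flag a batch whose estimated boundary differs from the reference."""
    return abs(boundary(*fit_gmm_1d(batch)) - ref_boundary) > THRESHOLD


print("same distribution flagged:", boundary_moved(same))
print("shifted class flagged:", boundary_moved(shifted))
```

The sketch only compares a single scalar boundary between batches. A detector for multidimensional streams, or one that separates real from virtual drift as the abstract describes, would need more than this comparison.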