Precise Change Point Detection using Spectral Drift Detection
The work addresses a practical concern for machine learning practitioners: models become inaccurate when the data distribution changes over time. The contribution is incremental in that it builds on existing unsupervised drift detection methods.
The paper tackles the detection of concept drift change points in unsupervised learning. It develops a new algorithm based on spectral properties of kernel embeddings of distributions, aimed at the false positives of small-window comparisons and at windows containing more than one drift event, and demonstrates its usefulness in several experiments.
Concept drift refers to the phenomenon that the data-generating distribution changes over time; as a consequence, machine learning models may become inaccurate and need adjustment. In this paper we consider the problem of detecting these change points in unsupervised learning. Many unsupervised approaches rely on the discrepancy between the sample distributions of two time windows. This procedure is noisy for small windows, hence prone to false positives, and it cannot deal with more than one drift event in a window. Instead, we rely on structural properties of drift-induced signals, which are derived from spectral properties of kernel embeddings of distributions. Building on these properties, we derive a new unsupervised drift detection algorithm, investigate its mathematical properties, and demonstrate its usefulness in several experiments.
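
To make the two ideas in the abstract concrete, the following is a minimal sketch, not the paper's algorithm. It assumes an RBF kernel, a biased MMD estimate as the two-window discrepancy, and a simple least-squares split of the leading eigenvector of the centered kernel matrix as a stand-in for a spectral change point estimate. All function names (`rbf_kernel`, `mmd2`, `leading_eigvec`, `estimate_change_point`) and the parameter `gamma` are illustrative assumptions.

```python
import numpy as np


def rbf_kernel(x, y, gamma=1.0):
    """RBF kernel matrix k(x_i, y_j) = exp(-gamma * ||x_i - y_j||^2)."""
    sq = (
        np.sum(x**2, axis=1)[:, None]
        + np.sum(y**2, axis=1)[None, :]
        - 2.0 * x @ y.T
    )
    return np.exp(-gamma * np.maximum(sq, 0.0))


def mmd2(x, y, gamma=1.0):
    """Biased estimate of the squared MMD between two windows x and y.

    This is the kind of two-window discrepancy the abstract refers to.
    """
    kxx = rbf_kernel(x, x, gamma).mean()
    kyy = rbf_kernel(y, y, gamma).mean()
    kxy = rbf_kernel(x, y, gamma).mean()
    return kxx + kyy - 2.0 * kxy


def leading_eigvec(x, gamma=1.0):
    """Leading eigenvector of the centered kernel matrix of a time-ordered sample."""
    n = len(x)
    k = rbf_kernel(x, x, gamma)
    h = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    _, vecs = np.linalg.eigh(h @ k @ h)  # eigenvalues in ascending order
    return vecs[:, -1]


def estimate_change_point(x, gamma=1.0):
    """Assumed heuristic: split the leading eigenvector into two segments
    so that the within-segment squared deviation is minimal."""
    v = leading_eigvec(x, gamma)
    n = len(v)
    best_t, best_cost = 1, np.inf
    for t in range(1, n):
        left, right = v[:t], v[t:]
        cost = np.sum((left - left.mean()) ** 2) + np.sum((right - right.mean()) ** 2)
        if cost < best_cost:
            best_t, best_cost = t, cost
    return best_t


# Synthetic stream with one abrupt drift (a mean shift) at time index 100.
rng = np.random.default_rng(0)
stream = np.concatenate(
    [rng.normal(0.0, 1.0, size=(100, 2)), rng.normal(3.0, 1.0, size=(100, 2))]
)
print("MMD^2 across the drift:", mmd2(stream[:100], stream[100:]))
print("MMD^2 without drift:   ", mmd2(stream[:50], stream[50:100]))
print("estimated change point:", estimate_change_point(stream))
```

On data like this, the centered kernel matrix has an approximately block-wise structure in time, so its leading eigenvector takes values of opposite sign before and after the drift. Structure of this kind can be read off the whole stream at once instead of comparing two fixed windows.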