A Remark on Concept Drift for Dependent Data
This work addresses concept drift for dependent data. The contribution is incremental: it modifies existing definitions for a specific machine learning setup.
The paper tackles the problem of concept drift in dependent data. It shows that temporal dependencies strongly influence the sampling process and that stationarity is an unsuitable notion for this setup, and it proposes alternative formal notions that are validated in numerical experiments.
Concept drift, i.e., a change in the data-generating distribution, can render machine learning models inaccurate. Several works address concept drift in the streaming context, usually assuming that consecutive data points are independent of each other. To generalize to dependent data, many authors link the notion of concept drift to time series. In this work, we show that temporal dependencies strongly influence the sampling process, so the definitions currently in use need major modifications. In particular, we show that the notion of stationarity is not suited to this setup, and we discuss alternatives. We demonstrate in numerical experiments that these alternative formal notions describe the observable learning behavior.
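As a rough illustration of why temporal dependence matters for sampling, the following sketch compares window means of an i.i.d. Gaussian stream with those of a stationary AR(1) stream that has the same marginal distribution. It is not taken from the paper's experiments: the AR(1) model, the coefficient 0.95, the window size, and all function names are assumptions chosen for illustration only.

```python
import random
import statistics


def ar1_series(n, phi, seed):
    """Stationary AR(1): x_t = phi * x_{t-1} + eps_t with unit marginal variance.

    Illustrative assumption: phi = 0 gives an i.i.d. N(0, 1) stream.
    """
    rng = random.Random(seed)
    noise_std = (1.0 - phi ** 2) ** 0.5  # keeps Var(x_t) = 1 for every t
    x = rng.gauss(0.0, 1.0)  # start in the stationary distribution
    series = []
    for _ in range(n):
        series.append(x)
        x = phi * x + rng.gauss(0.0, noise_std)
    return series


def window_means(series, window):
    """Means of consecutive, non-overlapping windows."""
    return [
        statistics.fmean(series[i:i + window])
        for i in range(0, len(series) - window + 1, window)
    ]


def spread_of_window_means(phi, n=20000, window=100, seed=0):
    """Standard deviation of the window means for a given dependence level."""
    return statistics.pstdev(window_means(ar1_series(n, phi, seed), window))


# Both streams are stationary with the same N(0, 1) marginal distribution;
# only the dependence between consecutive points differs.
iid_spread = spread_of_window_means(phi=0.0)
dependent_spread = spread_of_window_means(phi=0.95)
print(f"i.i.d. spread of window means:    {iid_spread:.3f}")
print(f"dependent spread of window means: {dependent_spread:.3f}")
```

Under these assumptions, the dependent stream's window means fluctuate far more than the i.i.d. ones even though neither distribution changes over time. A learner or drift detector that compares windows could therefore see apparent change in a stationary process, which is one way to read the claim that stationarity alone does not capture the observable behavior.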