Learning Parameter Distributions to Detect Concept Drift in Data Streams
This work addresses the challenge of maintaining predictive quality in online learning on streaming data. The contribution appears incremental: it builds on existing drift detection approaches and adds a novel method based on parameter distributions.
The authors tackle concept drift detection in data streams by proposing ERICS, a model-agnostic framework that treats model parameters as random variables and applies measures from information theory. Their evaluation on synthetic and real-world data sets suggests that ERICS identifies drift more effectively and precisely than existing methods.
Data distributions in streaming environments are usually not stationary. In order to maintain a high predictive quality at all times, online learning models need to adapt to distributional changes, which are known as concept drift. The timely and robust identification of concept drift can be difficult, as we never have access to the true distribution of streaming data. In this work, we propose a novel framework for the detection of real concept drift, called ERICS. By treating the parameters of a predictive model as random variables, we show that concept drift corresponds to a change in the distribution of optimal parameters. To this end, we adopt common measures from information theory. The proposed framework is completely model-agnostic. By choosing an appropriate base model, ERICS is also capable of detecting concept drift at the input level, which is a significant advantage over existing approaches. An evaluation on several synthetic and real-world data sets suggests that the proposed framework identifies concept drift more effectively and precisely than various existing works.
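The core idea, that drift shows up as a change in the distribution of model parameters, can be illustrated with a minimal sketch. This is not the ERICS algorithm itself: the abstract does not specify which information-theoretic measures are used or how alerts are triggered. The sketch assumes diagonal Gaussian parameter distributions, uses the KL divergence between consecutive time steps as the change measure, and applies a fixed threshold. The names gaussian_kl and detect_drift, the toy data, and the threshold value are all hypothetical.

```python
import numpy as np


def gaussian_kl(mu_p, var_p, mu_q, var_q):
    """KL(P || Q) between two diagonal Gaussians, summed over dimensions."""
    mu_p, var_p, mu_q, var_q = (
        np.asarray(a, dtype=float) for a in (mu_p, var_p, mu_q, var_q)
    )
    return float(
        0.5 * np.sum(np.log(var_q / var_p) + (var_p + (mu_p - mu_q) ** 2) / var_q - 1.0)
    )


def detect_drift(param_means, param_vars, threshold):
    """Return the time steps t at which the parameter distribution at t
    diverges from the one at t - 1 by more than `threshold` (assumed rule)."""
    alerts = []
    for t in range(1, len(param_means)):
        kl = gaussian_kl(
            param_means[t], param_vars[t], param_means[t - 1], param_vars[t - 1]
        )
        if kl > threshold:
            alerts.append(t)
    return alerts


# Toy stream of parameter distributions: the mean of the first parameter
# jumps from 0 to 2 at step 5, while the variances stay at 1.
means = [[0.0, 0.0]] * 5 + [[2.0, 0.0]] * 5
variances = [[1.0, 1.0]] * 10

alerts = detect_drift(means, variances, threshold=0.5)
print(alerts)
```

In this toy stream, the only nonzero divergence occurs at the step where the mean shifts, so that step is the only one flagged. In a real setting the means and variances would come from an online model that updates its parameter distribution with each batch, and a fixed threshold would likely need to be replaced by an adaptive one.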