DDG-DA: Data Distribution Generation for Predictable Concept Drift Adaptation
This work addresses the challenge of adapting machine learning models to predictable changes in data distribution in real-world streaming applications, going beyond traditional detection-based methods by forecasting the drift rather than reacting to it.
The paper tackles predictable concept drift in streaming data by proposing DDG-DA, a method that forecasts future data distributions and uses the forecast to generate training samples. It reports significant performance improvements for multiple models on stock price trend, electricity load, and solar irradiance forecasting tasks.
In many real-world scenarios, we deal with streaming data that is collected sequentially over time. Because the environment is non-stationary, the distribution of the streaming data may change in unpredictable ways, a phenomenon known as concept drift. To handle concept drift, previous methods first detect when and where the drift happens and then adapt models to fit the distribution of the latest data. However, there are still many cases in which some underlying factors of the environment's evolution are predictable, making it possible to model the future trend of concept drift in the streaming data; such cases are not fully explored in previous work. In this paper, we propose a novel method, DDG-DA, which can effectively forecast the evolution of the data distribution and improve the performance of models. Specifically, we first train a predictor to estimate the future data distribution, then leverage it to generate training samples, and finally train models on the generated data. We conduct experiments on three real-world tasks (forecasting stock price trend, electricity load, and solar irradiance) and obtain significant improvements on multiple widely used models.
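The three-step recipe in the abstract (estimate the future distribution, generate training samples, train on them) can be illustrated with a toy sketch. This is not the authors' implementation: the synthetic stream, the linear-extrapolation "predictor", the softmax weighting over historical periods, and every function name below (`make_stream`, `forecast_next_slope`, `period_weights`, `resample`, `run_demo`) are assumptions made purely for illustration.

```python
import numpy as np

def make_stream(n_periods=8, n_per_period=200, seed=0):
    """Synthetic stream y = w_t * x + noise; the slope w_t drifts linearly (demo assumption)."""
    rng = np.random.default_rng(seed)
    periods = []
    for t in range(n_periods):
        w_t = 1.0 + 0.5 * t  # predictable drift
        x = rng.normal(size=n_per_period)
        y = w_t * x + 0.1 * rng.normal(size=n_per_period)
        periods.append((x, y))
    return periods

def fit_slope(x, y):
    """Least-squares fit of the one-parameter model y = w * x."""
    return float(np.sum(x * y) / np.sum(x * x))

def forecast_next_slope(slopes):
    """Step 1 (stand-in predictor): extrapolate the per-period slope linearly in time."""
    t = np.arange(len(slopes))
    a, b = np.polyfit(t, slopes, deg=1)  # a = trend per period, b = intercept
    return float(a * len(slopes) + b)

def period_weights(slopes, target, temperature=0.5):
    """Step 2a: weight each historical period by its closeness to the forecast."""
    logits = -np.abs(np.asarray(slopes) - target) / temperature
    w = np.exp(logits - logits.max())
    return w / w.sum()

def resample(periods, weights, n_samples, seed=0):
    """Step 2b: generate a training set by sampling historical points with those weights."""
    rng = np.random.default_rng(seed)
    chosen = rng.choice(len(periods), size=n_samples, p=weights)
    xs, ys = [], []
    for k in chosen:
        x, y = periods[k]
        j = rng.integers(len(x))
        xs.append(x[j])
        ys.append(y[j])
    return np.array(xs), np.array(ys)

def run_demo():
    periods = make_stream()
    history, (x_future, y_future) = periods[:-1], periods[-1]

    slopes = [fit_slope(x, y) for x, y in history]
    target = forecast_next_slope(slopes)
    weights = period_weights(slopes, target)

    # Step 3: train on the generated data; the baseline trains on all history uniformly.
    x_gen, y_gen = resample(history, weights, n_samples=2000)
    w_adapted = fit_slope(x_gen, y_gen)
    w_baseline = fit_slope(
        np.concatenate([x for x, _ in history]),
        np.concatenate([y for _, y in history]),
    )

    def mse(w):
        return float(np.mean((y_future - w * x_future) ** 2))

    return {
        "forecast_slope": target,
        "weights": weights,
        "mse_adapted": mse(w_adapted),
        "mse_baseline": mse(w_baseline),
    }

if __name__ == "__main__":
    result = run_demo()
    print("adapted MSE: ", result["mse_adapted"])
    print("baseline MSE:", result["mse_baseline"])
```

In this toy setting, the model trained on the reweighted sample should track the next period better than the one trained uniformly on all history, because the weights concentrate on the periods closest to the forecast. Resampling can only reuse patterns that already occurred, so the sketch cannot extrapolate beyond the most recent slope; it is meant to show the flow of the three steps, not to reproduce the paper's results.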