Adaptation Strategies for Automated Machine Learning on Evolving Data
This work addresses the challenge of maintaining AutoML performance in dynamic data environments, though its contribution is incremental: it proposes specific adaptation strategies rather than a new paradigm.
The study examines how AutoML systems can adapt to evolving data with concept drift. It proposes six adaptation strategies and evaluates them on several AutoML approaches, yielding empirical insights into their effectiveness across real-world and synthetic data streams.
Automated Machine Learning (AutoML) systems have been shown to efficiently build good models for new datasets. However, it is often unclear how well they can adapt when the data evolves over time. The main goal of this study is to understand how data stream challenges such as concept drift affect the performance of AutoML methods, and which adaptation strategies can make them more robust. To that end, we propose six concept drift adaptation strategies and evaluate their effectiveness on a variety of AutoML approaches for building machine learning pipelines, including those that leverage Bayesian optimization, genetic programming, and random search with automated stacking. The strategies are evaluated empirically on real-world and synthetic data streams with different types of concept drift. Based on this analysis, we propose ways to develop more sophisticated and robust AutoML techniques.
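The abstract does not enumerate the six strategies, so the following is only an illustrative sketch of one generic family of drift adaptation: monitor the recent error rate and retrain on a window of recent data when it rises. It is not the paper's method. All names here (`make_stream`, `ThresholdModel`, `run`) are hypothetical, the toy `ThresholdModel` merely stands in for a pipeline that an AutoML search would produce, and the error-rate trigger is an assumed stand-in for a proper drift detector.

```python
from collections import deque


def make_stream(n=400, drift_at=200):
    """Synthetic 1-D stream with abrupt drift: the label rule flips at drift_at."""
    stream = []
    for i in range(n):
        x = (i * 37 % 100) / 100.0  # deterministic pseudo-feature in [0, 1)
        y = int(x > 0.5) if i < drift_at else int(x <= 0.5)
        stream.append((x, y))
    return stream


class ThresholdModel:
    """Toy stand-in for an AutoML-built pipeline (hypothetical, for illustration)."""

    def fit(self, data):
        # Pick the orientation of the x > 0.5 rule that makes fewer training errors.
        err_pos = sum(int(x > 0.5) != y for x, y in data)
        self.positive_above = err_pos <= len(data) - err_pos
        return self

    def predict(self, x):
        above = x > 0.5
        return int(above) if self.positive_above else int(not above)


def run(stream, adapt, window=50, threshold=0.5, warmup=50):
    """Prequential (test-then-train) evaluation with optional drift-triggered retraining."""
    model = ThresholdModel().fit(stream[:warmup])
    recent = deque(maxlen=window)  # most recent labelled examples
    errors = deque(maxlen=window)  # most recent 0/1 prediction errors
    correct = 0
    retrains = 0
    for x, y in stream[warmup:]:
        pred = model.predict(x)  # predict first, then reveal the label
        correct += int(pred == y)
        recent.append((x, y))
        errors.append(int(pred != y))
        # Assumed trigger: windowed error rate above a fixed threshold.
        if adapt and len(errors) == window and sum(errors) / window > threshold:
            model = ThresholdModel().fit(list(recent))  # retrain on recent data only
            errors.clear()
            retrains += 1
    return correct / (len(stream) - warmup), retrains


if __name__ == "__main__":
    stream = make_stream()
    print("static model:  ", run(stream, adapt=False))
    print("adaptive model:", run(stream, adapt=True))
```

In this toy setting the static model keeps applying the pre-drift rule, while the adaptive variant recovers after a single retrain. In a real system the retraining step could instead rerun the full AutoML search or only refit the best pipeline found so far, which trades adaptation quality against compute cost.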