LG AI NE
Aug 19, 2025

Dynamic Design of Machine Learning Pipelines via Metalearning

Edesio Alcobaça, André C. P. L. F. de Carvalho

arXiv:2508.13436v14.1
h-index: 5

Originality: Incremental advance

AI Analysis

This work addresses efficiency and overfitting issues in AutoML for users seeking faster and more reliable automated machine learning pipelines.

The paper tackles the high computational cost and overfitting risk of AutoML by introducing a metalearning method that dynamically designs search spaces. It reports an 89% runtime reduction in Random Search and a search-space reduction of 1.8/13 for preprocessors and 4.3/16 for classifiers, without significant loss of predictive performance.

Automated machine learning (AutoML) has democratized the design of machine learning-based systems by automating model selection, hyperparameter tuning, and feature engineering. However, the high computational cost of traditional search and optimization strategies, such as Random Search, Particle Swarm Optimization, and Bayesian Optimization, remains a significant challenge. Moreover, AutoML systems typically explore a large search space, which can lead to overfitting. This paper introduces a metalearning method for dynamically designing search spaces for AutoML systems. The proposed method uses historical metaknowledge to select promising regions of the search space, accelerating the optimization process. In the experiments conducted for this study, the proposed method reduced runtime by 89% in Random Search and the search space by 1.8/13 for preprocessors and 4.3/16 for classifiers, without significantly compromising predictive performance. The proposed method also showed competitive performance when adapted to Auto-Sklearn, reducing its search space. Finally, the study offers insights into meta-feature selection, meta-model explainability, and the trade-offs inherent in search space reduction strategies.
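
The idea of using historical metaknowledge to prune the search space before running Random Search can be illustrated with a minimal sketch. This is not the authors' method: it swaps in a simple nearest-neighbour vote over meta-features in place of the paper's meta-model, and the meta-feature values, the classifier names, and the helpers `reduce_search_space` and `random_search` are all hypothetical.

```python
import math
import random
from collections import Counter

# Hypothetical metaknowledge: for each past dataset, a toy meta-feature
# vector and the set of classifiers that performed well on it.
META_KNOWLEDGE = [
    ((2.0, 1.0, 0.9), {"random_forest", "gradient_boosting"}),
    ((2.1, 1.2, 1.0), {"random_forest", "extra_trees"}),
    ((4.0, 3.0, 0.3), {"linear_svm", "logistic_regression"}),
    ((3.9, 2.8, 0.4), {"linear_svm", "sgd"}),
    ((2.2, 0.9, 0.8), {"random_forest", "knn"}),
]

# Hypothetical full search space of classifiers.
FULL_SPACE = [
    "random_forest", "gradient_boosting", "extra_trees", "linear_svm",
    "logistic_regression", "sgd", "knn", "naive_bayes",
]


def reduce_search_space(meta_features, k=3, min_votes=2):
    """Keep classifiers that did well on at least `min_votes` of the
    k past datasets closest to the new one in meta-feature space."""
    neighbours = sorted(
        META_KNOWLEDGE, key=lambda entry: math.dist(entry[0], meta_features)
    )[:k]
    votes = Counter(clf for _, good in neighbours for clf in good)
    kept = [clf for clf in FULL_SPACE if votes[clf] >= min_votes]
    # Fall back to the full space if the vote removes everything.
    return kept or list(FULL_SPACE)


def random_search(space, evaluate, n_iter=20, seed=0):
    """Plain random search: sample candidates and keep the best score."""
    rng = random.Random(seed)
    best, best_score = None, -math.inf
    for _ in range(n_iter):
        candidate = rng.choice(space)
        score = evaluate(candidate)
        if score > best_score:
            best, best_score = candidate, score
    return best, best_score


if __name__ == "__main__":
    reduced = reduce_search_space((2.05, 1.1, 0.95))
    print(f"search space: {len(reduced)}/{len(FULL_SPACE)} classifiers")
    # `evaluate` stands in for cross-validated pipeline scoring.
    print(random_search(reduced, evaluate=lambda clf: 0.8, n_iter=5))
```

In this sketch the saving comes from Random Search sampling from fewer candidates. The cost is that a classifier voted out of the reduced space can never be selected, which is the kind of trade-off the abstract attributes to search space reduction strategies.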
