LG | Feb 13, 2025

Bayesian Optimization for Simultaneous Selection of Machine Learning Algorithms and Hyperparameters on Shared Latent Space

Kazuki Ishikawa, Ryota Ozaki, Yohei Kanzaki, Ichiro Takeuchi, Masayuki Karasuyama

arXiv:2502.09329v14.1 | h-index: 20 | KDD

Originality: Highly original

AI Analysis

This study addresses the problem of optimizing machine learning algorithm selection and hyperparameter tuning for developers of high-performance ML systems, providing an incremental improvement over existing Bayesian optimization approaches.

The authors tackle the problem of selecting the optimal combination of a machine learning algorithm and its hyperparameters, aiming for efficient optimization with a smaller number of total observations. The effectiveness of the proposed method is demonstrated empirically on datasets from OpenML.

Selecting the optimal combination of a machine learning (ML) algorithm and its hyper-parameters is crucial for developing high-performance ML systems. However, because the number of combinations of ML algorithms and hyper-parameters is enormous, exhaustive validation requires a significant amount of time. Many existing studies use Bayesian optimization (BO) to accelerate the search. A significant difficulty, however, is that each candidate ML algorithm generally has a different hyper-parameter space. BO-based approaches typically build a surrogate model independently for each hyper-parameter space, so sufficient observations are required for every candidate ML algorithm. In this study, our proposed method embeds the different hyper-parameter spaces into a shared latent space, in which a multi-task surrogate model for BO is estimated. This approach can share information from observations across different ML algorithms, so efficient optimization is expected with a smaller number of total observations. We further propose pre-training the latent space embedding with an adversarial regularization, and a ranking model for selecting an effective pre-trained embedding for a given target dataset. Our empirical study demonstrates the effectiveness of the proposed method on datasets from OpenML.
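The core idea of the abstract, pooling observations from algorithms with different hyper-parameter spaces through a shared latent space, can be illustrated with a minimal sketch. This is not the authors' implementation, and several simplifications are assumptions made here for illustration: the embeddings are fixed random linear maps (the paper learns and pre-trains them with adversarial regularization), the surrogate is a plain single-output Gaussian process over the latent space (the paper uses a multi-task model), the acquisition function is a simple upper confidence bound, and `toy_score` is a synthetic stand-in for validation accuracy. The names `SPACES`, `EMBED`, `embed`, `suggest`, and `run` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical search spaces: algorithm name -> number of hyper-parameters,
# each hyper-parameter assumed to be scaled to [0, 1].
SPACES = {"svm": 2, "random_forest": 3}
LATENT_DIM = 2

# Assumption: fixed random linear maps stand in for the learned,
# pre-trained embeddings described in the abstract.
EMBED = {name: rng.normal(size=(dim, LATENT_DIM)) for name, dim in SPACES.items()}


def embed(name, x):
    """Map hyper-parameters of algorithm `name` into the shared latent space."""
    return np.atleast_2d(x) @ EMBED[name]


def rbf(a, b, length_scale=1.0):
    """Squared-exponential kernel between two sets of latent points."""
    sq = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * sq / length_scale**2)


def gp_posterior(z_train, y_train, z_test, noise=1e-3):
    """Posterior mean and standard deviation of a zero-mean GP."""
    k = rbf(z_train, z_train) + noise * np.eye(len(z_train))
    k_s = rbf(z_train, z_test)
    mean = k_s.T @ np.linalg.solve(k, y_train)
    var = 1.0 - np.einsum("ij,ij->j", k_s, np.linalg.solve(k, k_s))
    return mean, np.sqrt(np.clip(var, 1e-12, None))


def toy_score(name, x):
    """Synthetic stand-in for validation accuracy (not a real ML evaluation)."""
    target = 0.3 if name == "svm" else 0.7
    return float(1.0 - np.mean((np.asarray(x) - target) ** 2))


def suggest(observations, n_candidates=200, beta=2.0):
    """Pick the next (algorithm, hyper-parameters) pair by UCB on the shared GP."""
    # Observations from all algorithms are pooled in the latent space.
    z_train = np.vstack([embed(n, x) for n, x, _ in observations])
    y = np.array([s for _, _, s in observations])
    y_mean = y.mean()
    best = None
    for name, dim in SPACES.items():
        cand = rng.uniform(size=(n_candidates, dim))
        mean, std = gp_posterior(z_train, y - y_mean, embed(name, cand))
        ucb = mean + y_mean + beta * std
        i = int(np.argmax(ucb))
        if best is None or ucb[i] > best[2]:
            best = (name, cand[i], float(ucb[i]))
    return best[0], best[1]


def run(n_init=2, n_iter=10):
    """Run a short BO loop over both algorithms and return all observations."""
    obs = []
    for name, dim in SPACES.items():
        for _ in range(n_init):
            x = rng.uniform(size=dim)
            obs.append((name, x, toy_score(name, x)))
    for _ in range(n_iter):
        name, x = suggest(obs)
        obs.append((name, x, toy_score(name, x)))
    return obs
```

Calling `run()` returns a list of `(algorithm, hyper_parameters, score)` tuples, and `max(run(), key=lambda o: o[2])` gives the best configuration found. Because every observation is embedded into the same latent space, a single surrogate is fit to all of them, in contrast to keeping one independent surrogate per algorithm.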

