LG May 6

Tree-Structured Synergy of Large Language Models and Bayesian Optimization for Efficient CASH

arXiv:2601.12355 49.3 h-index: 8
AI Analysis

For AutoML practitioners, LB-MCTS tackles the cold-start problem and high-dimensional CASH search spaces by combining LLM semantic priors with BO's quantitative search.

LB-MCTS integrates LLMs and Bayesian Optimization via a shared Monte Carlo Tree Search state to solve the CASH problem, outperforming BO-based, LLM-based, and hybrid baselines on 104 AMLB datasets.

To lower the expertise barrier in machine learning, the AutoML community has focused on the Combined Algorithm Selection and Hyperparameter optimization (CASH) problem, which jointly automates algorithm selection and hyperparameter tuning. While traditional methods such as Bayesian Optimization (BO) struggle with cold-start issues, Large Language Models (LLMs) can mitigate them through semantic priors. However, existing LLM-based optimizers generalize poorly to high-dimensional, structured CASH spaces. In this paper, we propose LB-MCTS, a trajectory-structured optimization framework that uses a Monte Carlo Tree Search (MCTS) tree as a shared state for algorithm selection, hyperparameter refinement, and BO-LLM proposer synergy. Within this shared state, BO provides algorithm-specific surrogate modeling for quantitative search, while the LLM exploits path-aware selective memory to generate semantic proposals and reflections. As the surrogate model improves, a reliability-aware proposer policy adaptively shifts from LLM-driven to BO-driven proposals within a unified search trajectory. Experiments on 104 AMLB datasets demonstrate that LB-MCTS consistently outperforms BO-based, LLM-based, and hybrid baselines.
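
The abstract does not spell out how the reliability-aware proposer policy is computed, so the following is only a minimal sketch of one way such a gate could work, not the authors' implementation. It assumes that surrogate reliability can be approximated by how often the surrogate ranks pairs of past configurations in the same order as their observed scores, and that the probability of choosing BO equals that reliability. The names `AlgorithmNode`, `surrogate_reliability`, `choose_proposer`, and the `min_obs` threshold are all hypothetical.

```python
import random
from dataclasses import dataclass, field


@dataclass
class AlgorithmNode:
    """One algorithm branch of the search tree (hypothetical structure)."""
    name: str
    # (surrogate_prediction, observed_score) pairs gathered for this algorithm
    history: list = field(default_factory=list)

    def record(self, predicted: float, observed: float) -> None:
        self.history.append((predicted, observed))


def surrogate_reliability(node: AlgorithmNode, min_obs: int = 4) -> float:
    """Assumed reliability proxy in [0, 1].

    Counts the pairs of observations whose ordering the surrogate got
    right. Returns 0.0 until `min_obs` observations exist (cold start).
    """
    h = node.history
    if len(h) < min_obs:
        return 0.0
    concordant = total = 0
    for i in range(len(h)):
        for j in range(i + 1, len(h)):
            d_pred = h[i][0] - h[j][0]
            d_obs = h[i][1] - h[j][1]
            if d_pred == 0 or d_obs == 0:
                continue  # ties carry no ranking information
            total += 1
            concordant += (d_pred > 0) == (d_obs > 0)
    if total == 0:
        return 0.0
    # 50% concordance is chance level, so rescale [0.5, 1] onto [0, 1]
    return max(0.0, 2.0 * concordant / total - 1.0)


def choose_proposer(node: AlgorithmNode, rng: random.Random) -> str:
    """Pick "BO" with probability equal to the reliability, else "LLM"."""
    return "BO" if rng.random() < surrogate_reliability(node) else "LLM"
```

Under these assumptions, a branch with few observations always defers to the LLM, and the share of BO proposals grows as the surrogate's rankings start to agree with the observed scores, which mirrors the shift from LLM-driven to BO-driven proposals described above.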

