LG, AI, Dec 7, 2023

Using Large Language Models for Hyperparameter Optimization

NVIDIA, U of Toronto
arXiv:2312.04528v2, 96 citations, h-index: 15
Originality: Highly original
AI Analysis

This paper addresses efficient hyperparameter tuning for machine learning practitioners. Its approach is novel but incremental: it applies LLMs to an existing bottleneck.

The paper tackles hyperparameter optimization by using large language models to suggest configurations, achieving performance that matches or outperforms traditional methods like Bayesian optimization within constrained search budgets.

This paper explores the use of foundational large language models (LLMs) in hyperparameter optimization (HPO). Hyperparameters are critical in determining the effectiveness of machine learning models, yet their optimization often relies on manual approaches in limited-budget settings. By prompting LLMs with dataset and model descriptions, we develop a methodology where LLMs suggest hyperparameter configurations, which are iteratively refined based on model performance. Our empirical evaluations reveal that, within constrained search budgets, LLMs can match or outperform traditional HPO methods like Bayesian optimization across different models on standard benchmarks. Furthermore, we propose to treat the code specifying our model as a hyperparameter that the LLM outputs, which affords greater flexibility than existing HPO approaches.
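
The suggest-evaluate-refine loop described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the prompt wording, the helper names (`build_prompt`, `llm_hpo`, `make_fake_llm`, `toy_loss`), and the single `learning_rate` hyperparameter are all assumptions made here, and the LLM is replaced by a random stand-in so the example runs offline.

```python
import json
import math
import random
from typing import Callable, Dict, List, Tuple

Config = Dict[str, float]
Trial = Tuple[Config, float]  # (configuration, validation loss)


def build_prompt(task: str, history: List[Trial]) -> str:
    """Describe the task and every past trial in plain text (hypothetical format)."""
    lines = [task, "Trials so far (config -> validation loss):"]
    for config, loss in history:
        lines.append(f"{json.dumps(config, sort_keys=True)} -> {loss:.4f}")
    lines.append('Reply with one JSON object such as {"learning_rate": 0.01}.')
    return "\n".join(lines)


def llm_hpo(
    task: str,
    ask_llm: Callable[[str], str],
    evaluate: Callable[[Config], float],
    budget: int,
) -> Trial:
    """Ask for a config, evaluate it, feed the result back; repeat `budget` times."""
    history: List[Trial] = []
    for _ in range(budget):
        reply = ask_llm(build_prompt(task, history))
        config = json.loads(reply)  # assumes the reply is valid JSON
        history.append((config, evaluate(config)))
    return min(history, key=lambda trial: trial[1])


def make_fake_llm(seed: int = 0) -> Callable[[str], str]:
    """Stand-in for a real LLM call: ignores the prompt and samples at random."""
    rng = random.Random(seed)

    def fake_llm(prompt: str) -> str:
        return json.dumps({"learning_rate": 10 ** rng.uniform(-5, 0)})

    return fake_llm


def toy_loss(config: Config) -> float:
    """Toy objective (an assumption): minimized at learning_rate = 1e-2."""
    return (math.log10(config["learning_rate"]) + 2.0) ** 2


best_config, best_loss = llm_hpo(
    "Tune the learning rate of a small MLP.", make_fake_llm(), toy_loss, budget=10
)
print(best_config, best_loss)
```

In practice `ask_llm` would wrap a call to an actual model, and `evaluate` would train and validate the network. The search budget maps directly to the number of loop iterations.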

Code Implementations: 1 repo
