Evaluating LLMs for Combinatorial Optimization: One-Phase and Two-Phase Heuristics for 2D Bin-Packing
This work assesses LLM capabilities in a specialized optimization domain, a question relevant to both researchers and practitioners, though the contribution appears incremental because it builds on existing approaches that combine LLMs with evolutionary algorithms.
This paper evaluates Large Language Models (LLMs) on combinatorial optimization, specifically 2D bin-packing, by introducing a framework that combines LLMs with evolutionary algorithms. The results show that GPT-4o achieves optimal solutions within two iterations, reducing average bin usage from 16 to 15 bins and improving space utilization from 0.76-0.78 to 0.83.
This paper presents an evaluation framework for assessing the capabilities of Large Language Models (LLMs) in combinatorial optimization, specifically the 2D bin-packing problem. We introduce a systematic methodology that combines LLMs with evolutionary algorithms to generate and iteratively refine heuristic solutions. Through comprehensive experiments comparing LLM-generated heuristics against traditional approaches (Finite First-Fit and Hybrid First-Fit), we demonstrate that LLMs can produce more efficient solutions while requiring fewer computational resources. Our evaluation reveals that GPT-4o achieves optimal solutions within two iterations, reducing average bin usage from 16 to 15 bins while improving space utilization from 0.76-0.78 to 0.83. This work contributes to understanding LLM evaluation in specialized domains and establishes benchmarks for assessing LLM performance in combinatorial optimization tasks.
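To make the baseline and the reported metrics concrete, the following is a minimal sketch of a one-phase, shelf-based first-fit heuristic in the spirit of Finite First-Fit, together with one plausible definition of space utilization (total item area divided by total area of the bins used). It is an illustration under stated assumptions, not the paper's implementation: the names `Shelf`, `Bin`, `shelf_first_fit`, and `space_utilization` are hypothetical, items are assumed to be non-rotatable rectangles given as `(width, height)` pairs, and the paper's exact baseline variants and utilization formula may differ.

```python
from dataclasses import dataclass, field


@dataclass
class Shelf:
    height: int          # set by the first (tallest) item placed on the shelf
    used_width: int = 0  # horizontal space already occupied


@dataclass
class Bin:
    shelves: list = field(default_factory=list)

    def used_height(self):
        return sum(s.height for s in self.shelves)


def shelf_first_fit(items, bin_w, bin_h):
    """Pack (width, height) items into bins of size bin_w x bin_h.

    Assumed simplification: items are sorted by non-increasing height and
    each item goes to the first shelf that fits, else a new shelf in the
    first bin with enough vertical room, else a new bin.
    """
    bins = []
    for w, h in sorted(items, key=lambda it: it[1], reverse=True):
        if w > bin_w or h > bin_h:
            raise ValueError(f"item {(w, h)} does not fit in an empty bin")

        # 1) Try the first existing shelf with enough height and width left.
        target = next(
            (s for b in bins for s in b.shelves
             if h <= s.height and s.used_width + w <= bin_w),
            None,
        )
        if target is not None:
            target.used_width += w
            continue

        # 2) Try opening a new shelf in the first bin with vertical room.
        host = next((b for b in bins if b.used_height() + h <= bin_h), None)
        if host is not None:
            host.shelves.append(Shelf(height=h, used_width=w))
            continue

        # 3) Otherwise open a new bin.
        bins.append(Bin(shelves=[Shelf(height=h, used_width=w)]))
    return bins


def space_utilization(items, n_bins, bin_w, bin_h):
    """Assumed metric: total item area / total area of the bins used."""
    return sum(w * h for w, h in items) / (n_bins * bin_w * bin_h)


# Usage: pack three items into 10 x 10 bins and report both metrics.
demo_items = [(6, 6), (6, 6), (4, 4)]
demo_bins = shelf_first_fit(demo_items, bin_w=10, bin_h=10)
print(len(demo_bins), space_utilization(demo_items, len(demo_bins), 10, 10))
```

Under this definition, the two reported metrics are linked: for a fixed item set, using fewer bins directly raises utilization, since the numerator (total item area) stays constant while the denominator shrinks. A two-phase heuristic such as Hybrid First-Fit would instead first build shelves as a strip packing and then assign whole shelves to bins as a one-dimensional packing step; that variant is not shown here.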