LG · CL · May 15, 2025

Learning Virtual Machine Scheduling in Cloud Computing through Language Agents

arXiv:2505.10117v2 · 2 citations · h-index: 6
Originality: Incremental advance
AI Analysis

This paper addresses the need for adaptive, interpretable scheduling in cloud services. Its approach improves on traditional and learning-based methods, though the contribution is incremental: it applies LLMs to a specific domain problem.

The paper tackles virtual machine scheduling in cloud computing, an online dynamic multidimensional bin packing problem. It proposes MiCo, a hierarchical language agent framework that uses large language models to design heuristics, and reports a 96.9% competitive ratio in large-scale scenarios with more than 10,000 virtual machines.
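To make the problem setting concrete, here is a minimal Python sketch of online dynamic two-dimensional bin packing with a plain first-fit heuristic. It is an illustration only, not code from the paper: the `Server` class, the event tuple format, the CPU and memory dimensions, and the rule that an episode ends at the first request that cannot be placed are all assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    """A bin with two resource dimensions (assumed: CPU cores and memory)."""
    cpu: int
    mem: int
    vms: dict = field(default_factory=dict)  # vm_id -> (cpu, mem)

    def fits(self, cpu, mem):
        return cpu <= self.cpu and mem <= self.mem

    def place(self, vm_id, cpu, mem):
        self.cpu -= cpu
        self.mem -= mem
        self.vms[vm_id] = (cpu, mem)

    def release(self, vm_id):
        cpu, mem = self.vms.pop(vm_id)
        self.cpu += cpu
        self.mem += mem


def first_fit(servers, cpu, mem):
    """Return the index of the first server with room, or None."""
    for i, server in enumerate(servers):
        if server.fits(cpu, mem):
            return i
    return None


def run(servers, events, policy=first_fit):
    """Process events one at a time, without looking ahead (online).

    Events are ("create", vm_id, cpu, mem) or ("delete", vm_id).
    Deletions free capacity, which is what makes the packing dynamic.
    Returns the number of VMs placed before the first rejection
    (the stopping rule is an assumption of this sketch).
    """
    where = {}
    placed = 0
    for event in events:
        if event[0] == "create":
            _, vm_id, cpu, mem = event
            idx = policy(servers, cpu, mem)
            if idx is None:
                break
            servers[idx].place(vm_id, cpu, mem)
            where[vm_id] = idx
            placed += 1
        else:
            _, vm_id = event
            servers[where.pop(vm_id)].release(vm_id)
    return placed
```

Any function with the same signature as `first_fit` can be passed as `policy`, which is the slot a designed heuristic would fill in this toy setup.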

In cloud services, virtual machine (VM) scheduling is a typical Online Dynamic Multidimensional Bin Packing (ODMBP) problem, characterized by large-scale complexity and fluctuating demands. Traditional optimization methods struggle to adapt to real-time changes, domain-expert-designed heuristic approaches suffer from rigid strategies, and existing learning-based methods often lack generalizability and interpretability. To address these limitations, this paper proposes a hierarchical language agent framework named MiCo, which provides a large language model (LLM)-driven heuristic design paradigm for solving ODMBP. Specifically, ODMBP is formulated as a Semi-Markov Decision Process with Options (SMDP-Option), enabling dynamic scheduling through a two-stage architecture, i.e., Option Miner and Option Composer. Option Miner utilizes LLMs to discover diverse and useful non-context-aware strategies by interacting with constructed environments. Option Composer employs LLMs to discover a composing strategy that integrates the non-context-aware strategies with the contextual ones. Extensive experiments on real-world enterprise datasets demonstrate that MiCo achieves a 96.9% competitive ratio in large-scale scenarios involving more than 10,000 virtual machines. It maintains high performance even under nonstationary request flows and diverse configurations, thus validating its effectiveness in complex and large-scale cloud environments.
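The two-stage idea in the abstract can be sketched in Python as a set of fixed placement heuristics (the "options") plus a rule that picks among them from the current context. This is a hand-written toy under stated assumptions, not MiCo itself: in the paper both stages are discovered by LLMs, whereas here `first_fit`, `best_fit`, the utilization-based `compose` rule, and its `threshold` parameter are hypothetical stand-ins. For brevity the sketch also re-selects an option on every request, while an SMDP option may persist over several steps.

```python
def first_fit(free, vm):
    """Option: lowest-index server with room for the VM."""
    for i, (cpu, mem) in enumerate(free):
        if vm[0] <= cpu and vm[1] <= mem:
            return i
    return None


def best_fit(free, vm):
    """Option: feasible server left with the least spare capacity."""
    best, best_left = None, None
    for i, (cpu, mem) in enumerate(free):
        if vm[0] <= cpu and vm[1] <= mem:
            left = (cpu - vm[0]) + (mem - vm[1])
            if best_left is None or left < best_left:
                best, best_left = i, left
    return best


# Pool of context-free strategies (stand-in for mined options).
OPTIONS = {"first_fit": first_fit, "best_fit": best_fit}


def compose(free, total_cpu, threshold=0.5):
    """Hypothetical composing rule: pack tightly once CPU use is high."""
    used = 1 - sum(cpu for cpu, _ in free) / total_cpu
    return "best_fit" if used >= threshold else "first_fit"


def schedule(capacities, requests, threshold=0.5):
    """Place (cpu, mem) requests online; return (option, server) per step."""
    free = [list(c) for c in capacities]
    total_cpu = sum(cpu for cpu, _ in capacities)
    trace = []
    for vm in requests:
        name = compose(free, total_cpu, threshold)
        idx = OPTIONS[name](free, vm)
        trace.append((name, idx))
        if idx is None:
            break  # assumed: stop at the first request that cannot be placed
        free[idx][0] -= vm[0]
        free[idx][1] -= vm[1]
    return trace
```

With servers of capacity (8, 16) and (2, 4) and requests (4, 8), (1, 2), (2, 4), the rule switches to `best_fit` on the third request and places it on the small server, where `first_fit` would have used the large one.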

