CL · Dec 16, 2025

Astraea: A State-Aware Scheduling Engine for LLM-Powered Agents

arXiv:2512.14142v1 · 1 citation · h-index: 3
Originality: Incremental advance
AI Analysis

The paper addresses inefficient scheduling for LLM-powered agents and improves the end-to-end latency of their workflows. The advance is incremental because it builds on existing systems such as vLLM.

The paper tackles the mismatch between existing inference systems' per-segment optimization and the multi-stage workflows of LLM-powered agents, proposing Astraea, a state-aware scheduling engine that reduces average Job Completion Time by up to 25.5% compared to baselines.

Large Language Models (LLMs) are increasingly being deployed as intelligent agents. Their multi-stage workflows, which alternate between local computation and calls to external network services such as Web APIs, create a mismatch between their execution pattern and the scheduling granularity of existing inference systems such as vLLM. Existing systems typically focus on per-segment optimization, which prevents them from minimizing the end-to-end latency of the complete agentic workflow, i.e., the global Job Completion Time (JCT) over the entire request lifecycle. To address this limitation, we propose Astraea, a service engine designed to shift the optimization from local segments to the global request lifecycle. Astraea employs a state-aware, hierarchical scheduling algorithm that integrates a request's historical state with future predictions. It dynamically classifies requests as I/O-intensive or compute-intensive and uses an enhanced Highest Response Ratio Next (HRRN) policy to balance efficiency and fairness. Astraea also implements an adaptive KV cache manager that handles the agent state during I/O waits according to system memory pressure. Extensive experiments show that Astraea reduces average JCT by up to 25.5% compared to baseline methods. Moreover, our approach demonstrates strong robustness and stability under high load across various model scales.
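For readers unfamiliar with HRRN, the following is a minimal sketch of the classic (textbook) policy only, not of Astraea's enhanced variant, whose details the abstract does not give. The `Request` class, its fields, and the `est_service` estimate are hypothetical names introduced for illustration. The priority is the response ratio, (waiting time + estimated service time) / estimated service time, so short jobs start with an advantage while long jobs gain priority as they wait.

```python
from dataclasses import dataclass


@dataclass
class Request:
    # Hypothetical request record; field names are illustrative assumptions.
    name: str
    arrival: float      # time the request entered the queue
    est_service: float  # estimated remaining compute time, must be > 0


def response_ratio(req: Request, now: float) -> float:
    """Classic HRRN priority: (waiting + service) / service."""
    if req.est_service <= 0:
        raise ValueError("est_service must be positive")
    waiting = max(0.0, now - req.arrival)
    return (waiting + req.est_service) / req.est_service


def pick_next(queue: list[Request], now: float) -> Request:
    """Select the queued request with the highest response ratio."""
    return max(queue, key=lambda r: response_ratio(r, now))


long_job = Request("long", arrival=0.0, est_service=10.0)

# A short job that has waited 2 time units outranks the long job (3.0 vs 2.0).
waited_short = Request("short", arrival=8.0, est_service=1.0)
first = pick_next([waited_short, long_job], now=10.0)

# A short job that has just arrived does not (1.0 vs 2.0): waiting time
# lets the long job win, which is how HRRN avoids starvation.
fresh_short = Request("short", arrival=10.0, est_service=1.0)
second = pick_next([fresh_short, long_job], now=10.0)
```

In this sketch the same long job loses to a short job that has already waited but wins against one that has only just arrived, which illustrates the efficiency and fairness trade-off the abstract attributes to HRRN-style scheduling.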

