DC, Apr 14, 2025

Dispatching Odyssey: Exploring Performance in Computing Clusters under Real-world Workloads

arXiv:2504.10184, 1 citation, h-index: 5
Originality: Synthesis-oriented
AI Analysis

For cluster operators and researchers, this work provides practical insights into dispatching policy performance under real-world workloads, challenging conventional wisdom about the superiority of size-based policies.

This paper uses data-driven simulations with Google workload traces to evaluate dispatching policies in computing clusters, finding that Join Idle Queue (JIQ) can outperform Least Work Left (LWL) under realistic conditions, and that a two-stage scheduling approach with service thresholds can further improve performance.

Recent workload measurements in Google data centers provide an opportunity to challenge existing models and, more broadly, to enhance the understanding of dispatching policies in computing clusters. Through extensive data-driven simulations, we aim to highlight the key features of workload traffic traces that influence response time performance under simple yet representative dispatching policies. For a given computational power budget, we vary the cluster size, i.e., the number of available servers. A job-level analysis reveals that Join Idle Queue (JIQ) and Least Work Left (LWL) exhibit an optimal working point for a fixed utilization coefficient as the number of servers is varied, whereas Round Robin (RR) performs monotonically worse. Additionally, we explore the accuracy of simple G/G queue approximations. Decomposing jobs into tasks yields interesting results; notably, the simpler, non-size-based policy JIQ appears to outperform the more "powerful" size-based LWL policy. Complementing these findings, we present preliminary results on a two-stage scheduling approach that partitions tasks based on service thresholds, illustrating that modest architectural modifications can further enhance performance under realistic workload conditions. We provide insights into these results and suggest promising directions for fully explaining the observed phenomena.
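To make the three policies concrete, the following is a minimal sketch of a dispatcher simulation, not the paper's simulator. It assumes FCFS servers of equal speed, a total speed budget split evenly across servers, and a JIQ fallback to a uniformly random server when none is idle. The function name `simulate`, its parameters, and the synthetic Poisson/exponential workload in the demo are illustrative assumptions; the paper uses Google workload traces.

```python
import random


def simulate(policy, arrivals, sizes, n_servers, total_speed=1.0, rng=None):
    """Mean response time of FCFS servers under a dispatching policy.

    Hypothetical helper for illustration: arrivals are sorted arrival
    times, sizes are job sizes in units of work.
    """
    rng = rng or random.Random(0)
    speed = total_speed / n_servers      # fixed power budget, split evenly
    free_at = [0.0] * n_servers          # time at which each server drains
    rr_next = 0
    total_response = 0.0
    for t, size in zip(arrivals, sizes):
        if policy == "RR":
            # Round Robin: cycle through servers regardless of their state.
            k = rr_next
            rr_next = (rr_next + 1) % n_servers
        elif policy == "LWL":
            # Least Work Left: with equal speeds, remaining work is
            # proportional to the remaining busy time of the server.
            k = min(range(n_servers), key=lambda i: max(free_at[i] - t, 0.0))
        elif policy == "JIQ":
            # Join Idle Queue: pick an idle server if one exists,
            # otherwise fall back to a random server (assumed fallback).
            idle = [i for i in range(n_servers) if free_at[i] <= t]
            k = rng.choice(idle) if idle else rng.randrange(n_servers)
        else:
            raise ValueError(f"unknown policy: {policy}")
        start = max(t, free_at[k])
        free_at[k] = start + size / speed
        total_response += free_at[k] - t
    return total_response / len(arrivals)


if __name__ == "__main__":
    # Synthetic workload (assumption): Poisson arrivals, exponential sizes,
    # utilization 0.8 for a total speed of 1.0 and mean job size 1.0.
    gen = random.Random(42)
    t, arrivals, sizes = 0.0, [], []
    for _ in range(20000):
        t += gen.expovariate(0.8)
        arrivals.append(t)
        sizes.append(gen.expovariate(1.0))
    for n in (1, 4, 16):
        for policy in ("RR", "JIQ", "LWL"):
            print(n, policy, round(simulate(policy, arrivals, sizes, n), 3))
```

Note that LWL needs each job's size at dispatch time (to track work left), whereas JIQ only needs to know which servers are idle, which is why the abstract describes JIQ as the simpler, non-size-based policy. Sweeping `n_servers` at fixed `total_speed` mirrors the abstract's experiment of varying cluster size under a given computational power budget, though results on a synthetic workload need not match those obtained from real traces.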
