PF PRMay 13

SPLIT: SymPathy for Large jobs Improves Tail latency

Zhouzi Li, Mor Harchol-Balter, Alan Scheller-Wolf

arXiv:2605.1374911.1

Predicted impact top 41% in PF · last 90 daysOriginality Highly original

AI Analysis

It solves a long-standing open problem in queueing theory for multi-server systems with heavy-tailed workloads, which is relevant for optimizing tail latency in modern computing clusters.

This paper provides the first strongly tail-optimal scheduling policies for the M/G/n queue with heavy-tailed job sizes, showing that giving a small amount of sympathy to large jobs is essential for tail optimality in multi-server systems, unlike single-server systems.

We study the asymptotic response time tail in the M/G/n multi-server queue with heavy-tailed (regularly varying) job sizes, a setting representative of modern computing workloads. For single-server systems, tail optimization is well understood: under heavy-tailed job sizes, policies such as SRPT that strictly prioritize short jobs are strongly tail optimal, and giving any priority to large jobs is harmful. For multi-server systems, the question has been almost entirely open. This paper gives the first strongly tail-optimal scheduling policies for the M/G/n queue with heavy-tailed job sizes. Our central finding is that the multi-server case is intrinsically different from the single-server case: giving a small amount of ``sympathy'' to large jobs is essential for strong tail optimality. We establish strong (or arbitrarily close to strong) tail optimality across the full stability region, both with and without knowledge of job sizes.

View on arXiv PDF

Similar