LG · Mar 11

Learning to Score: Tuning Cluster Schedulers through Reinforcement Learning

Martin Asenov, Qiwen Deng, Gingfung Yeung, Adam Barker

arXiv:2603.10545v14.5 · h-index: 22

Predicted impact: top 94% in LG · last 90 days · Originality: Incremental advance

AI Analysis

This work addresses the challenge of efficient job scheduling in large-scale clusters for improved utilization and performance, representing an incremental advance in tuning methods.

The paper tackles the problem of sub-optimal job allocation in cluster schedulers by proposing a reinforcement learning approach to tune scoring function weights, resulting in an average performance improvement of 33% over fixed weights and 12% over the best baseline in a lab-based serverless scenario.

Efficiently allocating incoming jobs to nodes in large-scale clusters can lead to substantial improvements in both cluster utilization and job performance. To allocate incoming jobs, cluster schedulers usually rely on a set of scoring functions to rank feasible nodes. Results from individual scoring functions are usually weighted equally, which can lead to sub-optimal deployments because a one-size-fits-all solution does not take into account the characteristics of each workload. Tuning the weights of scoring functions, however, requires expert knowledge and is computationally expensive. This paper proposes a reinforcement learning approach for learning the weights in scheduler scoring algorithms, with the overall objective of improving the end-to-end performance of jobs for a given cluster. Our approach is based on three components: a percentage improvement reward, frame-stacking, and limiting domain information. The percentage improvement reward addresses the objective of multi-step parameter tuning. Frame-stacking carries information across the steps of an optimization experiment. Limiting domain information prevents overfitting and improves performance in unseen clusters and workloads. The policy is trained on different combinations of workloads and cluster setups. We demonstrate that the proposed approach improves performance by 33% on average compared to fixed weights and by 12% compared to the best-performing baseline in a lab-based serverless scenario.
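The three mechanics the abstract names (weighted scoring of feasible nodes, a percentage improvement reward, and frame-stacking) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the `Node` fields, the two scoring functions, the reward formula (relative improvement of a lower-is-better metric over a baseline measurement), and the `FrameStack` helper are all hypothetical.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass(frozen=True)
class Node:
    """Hypothetical node description; fields are fractions in [0, 1]."""
    name: str
    free_cpu: float
    free_mem: float


ScoreFn = Callable[[Node], float]


def cpu_score(node: Node) -> float:
    # Hypothetical scoring function: prefer nodes with more free CPU.
    return node.free_cpu


def mem_score(node: Node) -> float:
    # Hypothetical scoring function: prefer nodes with more free memory.
    return node.free_mem


def rank_nodes(nodes: Sequence[Node], score_fns: Sequence[ScoreFn],
               weights: Sequence[float]) -> List[Node]:
    """Rank feasible nodes by the weighted sum of their scores, best first."""
    if len(score_fns) != len(weights):
        raise ValueError("need exactly one weight per scoring function")

    def total(node: Node) -> float:
        return sum(w * fn(node) for w, fn in zip(weights, score_fns))

    return sorted(nodes, key=total, reverse=True)


def percentage_improvement(baseline: float, current: float) -> float:
    """Assumed reward: relative improvement of a lower-is-better metric
    (for example, job completion time) over a baseline measurement."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - current) / baseline


class FrameStack:
    """Keeps the last k observations and returns them flattened, oldest first,
    so a policy can see how earlier tuning steps turned out."""

    def __init__(self, k: int) -> None:
        self.frames = deque(maxlen=k)

    def push(self, observation: Sequence[float]) -> List[float]:
        self.frames.append(list(observation))
        return [x for frame in self.frames for x in frame]


nodes = [Node("a", free_cpu=0.9, free_mem=0.2),
         Node("b", free_cpu=0.3, free_mem=0.9)]
equal_ranking = rank_nodes(nodes, [cpu_score, mem_score], [1.0, 1.0])
tuned_ranking = rank_nodes(nodes, [cpu_score, mem_score], [1.0, 0.2])
```

With equal weights node `b` ranks first, while down-weighting the memory score puts node `a` first. That change in ranking is the kind of effect a learned weight vector is meant to exploit. In an actual tuning loop, the weights would come from the trained policy rather than being set by hand.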
