Capacity Planning and Scheduling for Jobs with Uncertainty in Resource Usage and Duration
This work addresses scheduling challenges for organizations in finance and similar industries where stochastic conditions affect job characteristics, though it appears incremental in its methodological approach.
The paper tackles capacity planning and job scheduling for the on-prem grid of a hybrid cloud/on-prem computing environment, handling uncertainty in both resource usage and job duration. Its best approach achieves much lower peak resource usage than manual scheduling without compromising quality-of-service.
Organizations around the world regularly schedule jobs (programs) to perform tasks dictated by their end users. Amid the broad shift toward cloud computing infrastructure, our organization follows a hybrid approach with both cloud and on-prem servers. The objective of this work is to perform capacity planning, i.e., estimating resource requirements, and job scheduling for on-prem grid computing environments. A key contribution of our approach is handling uncertainty in both the resource usage and the duration of jobs, a critical aspect in the finance industry, where stochastic market conditions significantly influence job characteristics. For capacity planning and scheduling, we balance two conflicting objectives: (a) minimizing resource usage, and (b) providing high quality-of-service to end users by completing jobs by their requested deadlines. We propose approximate approaches using deterministic estimators and pair sampling-based constraint programming. Our best approach (pair sampling-based) achieves much lower peak resource usage than manual scheduling without compromising quality-of-service.
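
The abstract gives no implementation details, so the following is only a minimal Python sketch of the trade-off it describes: choosing job start times that keep peak resource usage low while every job still meets its deadline under sampled conditions. It rests on several assumptions that are not in the source. Time is divided into discrete slots. Each scenario holds one jointly sampled (usage, duration) pair per job, which is one possible reading of "pair sampling" and may not match the authors' definition. Exhaustive search stands in for a constraint programming solver. All function names and numbers are hypothetical.

```python
import itertools


def peak_usage(starts, scenario, horizon):
    """Peak total resource usage across time slots for one sampled scenario.

    starts[j] is the start slot of job j; scenario[j] is that job's
    sampled (usage, duration) pair.
    """
    load = [0.0] * horizon
    for start, (usage, duration) in zip(starts, scenario):
        for t in range(start, min(start + duration, horizon)):
            load[t] += usage
    return max(load)


def on_time_fraction(starts, scenario, deadlines):
    """Fraction of jobs that finish by their deadline in one scenario."""
    met = sum(
        1
        for start, (_, duration), deadline in zip(starts, scenario, deadlines)
        if start + duration <= deadline
    )
    return met / len(deadlines)


def evaluate(starts, scenarios, deadlines, horizon):
    """Return (worst-case peak usage, mean on-time fraction) over all scenarios."""
    peaks = [peak_usage(starts, sc, horizon) for sc in scenarios]
    qos = [on_time_fraction(starts, sc, deadlines) for sc in scenarios]
    return max(peaks), sum(qos) / len(qos)


def best_schedule(scenarios, deadlines, horizon, min_qos=1.0):
    """Exhaustively search start times for the lowest worst-case peak.

    Only schedules whose mean on-time fraction is at least min_qos are
    accepted. Brute force is a stand-in for a constraint programming
    solver and is only feasible for a handful of jobs.
    """
    best = None
    for starts in itertools.product(range(horizon), repeat=len(deadlines)):
        peak, qos = evaluate(starts, scenarios, deadlines, horizon)
        if qos >= min_qos and (best is None or peak < best[0]):
            best = (peak, starts)
    return best


# Hypothetical data: two jobs, two sampled scenarios of (usage, duration) pairs.
scenarios = [
    [(2, 2), (3, 1)],  # scenario 1: both jobs run short
    [(2, 3), (3, 2)],  # scenario 2: both jobs run long
]
deadlines = [5, 5]

print(best_schedule(scenarios, deadlines, horizon=5))  # (3.0, (0, 3))
```

Starting both jobs at slot 0 would give a peak of 5 units in either scenario. Delaying the second job to slot 3 keeps the peak at 3 units while both deadlines are still met in both scenarios. A deterministic estimator, in this framing, would correspond to running the same search on a single scenario built from point estimates such as per-job means or maxima.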