DeepPlace: Learning to Place Applications in Multi-Tenant Clusters
DeepPlace addresses scheduling inefficiencies faced by operators of large multi-tenant clusters, though the contribution appears incremental because it applies an existing method (deep RL) to a known bottleneck.
The paper targets the difficulty of hand-writing placement rules for scheduling diverse applications in multi-tenant clusters, where manually created rules tend to be suboptimal. It proposes DeepPlace, a scheduler that uses deep reinforcement learning to reduce resource competition and optimize cluster utilization.
Large multi-tenant production clusters often have to handle a variety of jobs and applications with complex resource usage characteristics. Manually creating placement rules that decide which applications should be co-located is non-trivial and tends to yield suboptimal placements. In this paper, we present DeepPlace, a scheduler that learns to exploit the temporal resource usage patterns of applications using Deep Reinforcement Learning (Deep RL) to reduce resource competition across jobs running on the same machine while at the same time optimizing overall cluster utilization.
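To illustrate why temporal usage patterns matter for co-location, the following is a minimal sketch and not DeepPlace's actual method. It assumes hypothetical per-time-step CPU traces (in percent of one machine) and a hypothetical `contention` score that sums demand above machine capacity. A learned scheduler could plausibly use a signal of this kind as part of its reward, but the abstract does not specify the reward, state, or action design.

```python
from typing import Sequence


def contention(a: Sequence[int], b: Sequence[int], capacity: int) -> int:
    """Sum, over time steps, of the combined demand that exceeds capacity.

    Hypothetical score for illustration only; it is not taken from the paper.
    """
    return sum(max(0, x + y - capacity) for x, y in zip(a, b))


# Hypothetical CPU usage traces, in percent of one machine, over six time steps.
daytime_app = [80, 70, 80, 20, 10, 20]
nighttime_app = [10, 20, 10, 70, 80, 70]
second_daytime_app = [70, 80, 70, 10, 20, 10]

# Apps whose peaks fall at different times fit on one machine together.
complementary = contention(daytime_app, nighttime_app, capacity=100)

# Apps whose peaks coincide compete for the same machine.
overlapping = contention(daytime_app, second_daytime_app, capacity=100)

print(complementary, overlapping)
```

Under these assumed traces, the pair with offset peaks never exceeds capacity, while the pair with coinciding peaks does. Handwritten placement rules struggle to capture this kind of time-dependent interaction across many applications.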