Reinforcement Learning for Optimal Load Distribution Sequencing in Resource-Sharing System
This work addresses scheduling efficiency for data-intensive systems that use virtualization, but it appears incremental: it applies an existing reinforcement learning method to a specific optimization problem.
The paper tackles the problem of determining the optimal sequence for distributing divisible loads to processors in a resource-sharing system so as to minimize finishing time. It uses a multi-armed bandit reinforcement learning approach with several optimizations and reports reaching the global optimum when the sample size is sufficiently large.
Divisible Load Theory (DLT) is a powerful tool for modeling divisible load problems in data-intensive systems. This paper studies an optimal divisible load distribution sequencing problem using a machine learning framework. The problem is to decide the optimal sequence in which to distribute divisible load to processors so as to achieve the minimum finishing time. The scheduling is performed in a resource-sharing system where each physical processor is virtualized into multiple virtual processors. A reinforcement learning method called the multi-armed bandit (MAB) is applied to this problem. We first provide a naive solution using the MAB algorithm and then apply several optimizations. Various numerical tests are conducted. The performance of our algorithm improves as training progresses, and the global optimum is achieved when the sample size is large enough.
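To make the sequencing idea concrete, the following is a minimal sketch of a naive bandit formulation in which every distribution sequence is one arm. It is not the paper's algorithm or system model. It assumes the classic single-level tree DLT model (sequential distribution, no front-end, all processors finishing at the same instant) rather than the virtualized resource-sharing model, and it uses a plain epsilon-greedy rule. The names `finishing_time` and `learn_sequence`, the per-unit communication times `z`, the per-unit computation times `w`, and the multiplicative measurement noise are all hypothetical choices made for illustration.

```python
import itertools
import random


def finishing_time(sequence, z, w):
    """Finishing time of a unit load for one distribution sequence.

    Assumed model (not taken from the paper): the root sends load to the
    processors one after another, and the load is split so that every
    processor stops computing at the same instant.
    z[i] is the time to send one unit of load to processor i,
    w[i] is the time for processor i to compute one unit of load.
    """
    # Equal finish times give alpha_next * (z_next + w_next) = alpha_prev * w_prev.
    fractions = [1.0]  # unnormalised load fractions, in distribution order
    for prev, cur in zip(sequence, sequence[1:]):
        fractions.append(fractions[-1] * w[prev] / (z[cur] + w[cur]))
    first = sequence[0]
    # The first processor receives its share and then computes until the end.
    return (fractions[0] / sum(fractions)) * (z[first] + w[first])


def learn_sequence(z, w, rounds=2000, epsilon=0.1, noise=0.05, seed=0):
    """Epsilon-greedy bandit in which each permutation of processors is an arm."""
    rng = random.Random(seed)
    arms = list(itertools.permutations(range(len(z))))
    counts = [0] * len(arms)
    means = [0.0] * len(arms)  # running mean of the observed cost per arm

    for t in range(rounds):
        if t < len(arms):
            i = t  # initial pass: try every arm once
        elif rng.random() < epsilon:
            i = rng.randrange(len(arms))  # explore a random sequence
        else:
            i = min(range(len(arms)), key=means.__getitem__)  # exploit lowest mean cost
        # Assumed noisy observation of the finishing time (illustrative only).
        cost = finishing_time(arms[i], z, w) * (1.0 + rng.gauss(0.0, noise))
        counts[i] += 1
        means[i] += (cost - means[i]) / counts[i]

    return arms[min(range(len(arms)), key=means.__getitem__)]


if __name__ == "__main__":
    z = [0.2, 0.5, 0.1, 0.4]  # hypothetical link times
    w = [1.0, 2.0, 1.5, 0.8]  # hypothetical compute times
    print(learn_sequence(z, w))
```

Enumerating every permutation is only feasible for a handful of processors, since the number of arms grows factorially. In this sketch, more rounds give each arm more samples, so the estimated best sequence becomes more reliable as the sample size grows.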